Bug #2787
sitemap URL with non-alphanumeric chars is not working
100%
Description
I ingested books whose names contain special characters. No errors occurred during ingestion. However, when I checked the sitemap files for those books, the sitemap URLs are listed at the public URL, but opening them shows an error message. I think that is the issue.
A) Sitemap files for books whose names contain special characters (such as colons, semicolons, brackets, or parentheses) in addition to alphanumeric characters:
(An error message is shown when visiting the sitemap URL.)
ex:
1. Tensorflow API Doc TF Module (Python)-PL-2022-10-30 16:23:38.140171-SPL
- https://edutestdev-240612.appspot.com/sitemap/edutestdev_sitemap/sitemap-tensorflow-api-doc-tf-module-%28python%29-2022-11-12-14%3A27%3A00.927566-sg12nov22.xml
2. https://edutestdev-240612.appspot.com/sitemap/edutestdev_sitemap/sitemap-python-python-for-beginners-%28full-course%29-dec01ak.xml
3. https://edutestdev-240612.appspot.com/sitemap/edutestdev_sitemap/sitemap-deep-learning-andrew-ng%2C-coursera-course-dsnov24ko.xml
B) Sitemap files for books whose names contain only alphanumeric characters:
(The sitemap URL works fine.)
ex:
1. https://edutestdev-240612.appspot.com/sitemap/edutestdev_sitemap/sitemap-python-3-tutorial-prd.xml
2. https://edutestdev-240612.appspot.com/sitemap/edutestdev_sitemap/sitemap-python-3-language-reference-prd.xml
3. https://edutestdev-240612.appspot.com/sitemap/edutestdev_sitemap/sitemap-ticket-2738-fix.xml
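To make the difference between groups A and B easier to see, here is a minimal diagnostic sketch in Python. It percent-decodes the file name in each sitemap URL and lists the characters that fall outside an assumed "safe" set. The safe set (lowercase letters, digits, hyphens, dots) and the helper names `sitemap_file_name` and `unsafe_chars` are my own assumptions for illustration, not part of the application code, and this does not confirm the root cause.

```python
import re
from urllib.parse import unquote, urlsplit

# Assumption: a "safe" character is a lowercase letter, digit, hyphen or dot.
SAFE_CHAR = re.compile(r"[a-z0-9.-]")


def sitemap_file_name(url: str) -> str:
    """Return the percent-decoded last path segment of a sitemap URL."""
    last_segment = urlsplit(url).path.rsplit("/", 1)[-1]
    return unquote(last_segment)


def unsafe_chars(url: str) -> list:
    """List the distinct file-name characters outside the assumed safe set."""
    name = sitemap_file_name(url)
    return sorted({ch for ch in name if not SAFE_CHAR.fullmatch(ch)})


BASE = "https://edutestdev-240612.appspot.com/sitemap/edutestdev_sitemap/"
urls = [
    # Group A: reported as failing
    BASE + "sitemap-tensorflow-api-doc-tf-module-%28python%29-2022-11-12-14%3A27%3A00.927566-sg12nov22.xml",
    BASE + "sitemap-python-python-for-beginners-%28full-course%29-dec01ak.xml",
    BASE + "sitemap-deep-learning-andrew-ng%2C-coursera-course-dsnov24ko.xml",
    # Group B: reported as working
    BASE + "sitemap-python-3-tutorial-prd.xml",
]

for url in urls:
    print(sitemap_file_name(url), unsafe_chars(url))
```

For the URLs above, the group A file names decode to names containing parentheses, colons, or a comma, while the group B file name contains no characters outside the assumed safe set.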
More URLs can be checked in edutestdev_sitemap at the public URL:
https://storage.cloud.google.com/edutestdev_sitemap/sitemap.xml