Task #1722
Split content kind into content and contentSupplement2 in order to speed up getDocumentAPI
100%
Description
Made around 30 calls to getDocumentAPI each for a short (2-line) document and a long (about 70-line) document and got the following results:
Document  Total calls  Average      Max  60th  80th  95th  98th
longer    30           58.16666667  151  60    70    151   151
shorter   35           47.69444444  144  45    55    128   144
These measurements suggest that splitting the content kind, so that fields getDocumentAPI does not need (such as cnh) are no longer retrieved during the call, should make it more efficient.
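For reference, summary statistics of the kind tabulated above could be computed along these lines. This is only a sketch: the ticket does not say how the percentiles were calculated, so the nearest-rank method is an assumption, and the timings below are made-up values, not the measurements from this task.

```python
from statistics import mean


def nearest_rank_percentile(samples, pct):
    """Return the pct-th percentile (integer pct) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling of pct/100 * n, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(samples):
    """Build one row of the results table from a list of call timings."""
    summary = {
        "total_calls": len(samples),
        "average": mean(samples),
        "max": max(samples),
    }
    for pct in (60, 80, 95, 98):
        summary[f"p{pct}"] = nearest_rank_percentile(samples, pct)
    return summary


# Hypothetical timings for illustration only.
example_timings = [40, 42, 45, 47, 50, 52, 55, 60, 70, 151]
print(summarize(example_timings))
```

With so few samples, the 95th and 98th percentiles collapse onto the single slowest call, which matches the pattern seen in the table.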
Going forward, content will only have the following fields:
allSectionNames
allSectionNamesAndUrls
baseUrl
br2ir2
breadcrumbStructuredData
data
documentId
documentProcessingId
docViewCount
heading
indexName
lUrl
nextDocument
nextSectionName
oUrl
previousDocument
prevSectionName
subSections
contentSupplement2 will contain the following fields:
Idvalues
url
updatedBy
updatedAt
createdBy
createdAt
checksum
links
cnh
s
book
status
rank
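As a rough illustration of the split, the two field lists above can be treated as a partition of an existing entity's properties. The sketch below works on plain dictionaries and does not call Datastore. The helper name split_entity is hypothetical, and how the two kinds are keyed and linked to each other is not covered here.

```python
# Field lists copied from this task description.
CONTENT_FIELDS = frozenset({
    "allSectionNames", "allSectionNamesAndUrls", "baseUrl", "br2ir2",
    "breadcrumbStructuredData", "data", "documentId", "documentProcessingId",
    "docViewCount", "heading", "indexName", "lUrl", "nextDocument",
    "nextSectionName", "oUrl", "previousDocument", "prevSectionName",
    "subSections",
})
SUPPLEMENT_FIELDS = frozenset({
    "Idvalues", "url", "updatedBy", "updatedAt", "createdBy", "createdAt",
    "checksum", "links", "cnh", "s", "book", "status", "rank",
})


def split_entity(properties):
    """Partition one entity's properties into (content, contentSupplement2).

    Hypothetical helper: raises if a property belongs to neither list, so
    that no field is silently dropped during the split.
    """
    unknown = set(properties) - CONTENT_FIELDS - SUPPLEMENT_FIELDS
    if unknown:
        raise ValueError(f"unassigned fields: {sorted(unknown)}")
    content = {k: v for k, v in properties.items() if k in CONTENT_FIELDS}
    supplement = {k: v for k, v in properties.items() if k in SUPPLEMENT_FIELDS}
    return content, supplement


# Made-up entity for illustration.
content, supplement = split_entity(
    {"documentId": "doc-1", "heading": "Intro", "cnh": "large value", "rank": 3}
)
print(sorted(content), sorted(supplement))
```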
JFYI: we were not able to use projection queries, because some of the required fields are larger than 1500 bytes and therefore cannot be part of the composite indexes that projection queries depend on.
More notes:
- A projection query is not suitable for getDocumentAPI because allSectionNames and allSectionNamesAndUrls are multi-valued fields. According to the documentation, projecting multi-valued fields returns a separate entity for each combination of values.
- Fields whose data exceeds 1500 bytes are excluded from indexing when we save to Datastore. According to the documentation, such fields cannot be used in the composite indexes that projection queries require.
--Reference: https://cloud.google.com/datastore/docs/concepts/indexes#:~:text=Composite%20indexes%20are%20composed%20of%20multiple%20properties%20and%20require%20that%20each%20individual%20property%20must%20not%20be%20excluded%20from%20your%20indexes.
--Reference: https://cloud.google.com/datastore/docs/concepts/storage-size#composite_indexes
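To make the multi-valued issue in the first note concrete, the following sketch imitates the documented behaviour in plain Python. It does not call Datastore: simulate_projection is a hypothetical stand-in, and the sample document is invented.

```python
from itertools import product


def simulate_projection(entity, projected):
    """Imitate a projection over multi-valued properties.

    Returns one result per unique combination of projected values, which is
    how the documentation describes projection on multi-valued properties.
    """
    value_lists = []
    for name in projected:
        value = entity[name]
        value_lists.append(value if isinstance(value, list) else [value])
    # dict.fromkeys removes duplicate combinations while keeping their order.
    unique_combos = dict.fromkeys(product(*value_lists))
    return [dict(zip(projected, combo)) for combo in unique_combos]


# Invented document with three sections.
doc = {
    "heading": "Intro",
    "allSectionNames": ["a", "b", "c"],
    "allSectionNamesAndUrls": ["a|/a", "b|/b", "c|/c"],
}
rows = simulate_projection(
    doc, ["heading", "allSectionNames", "allSectionNamesAndUrls"]
)
print(len(rows))  # 9 results for a single stored document
```

Under this model a document with n sections would come back as n × n partial results instead of one entity, which would then have to be reassembled by the caller.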