OutOfMemory exception when processing a large number of documents
Hello,
I want to process a large number of documents (around 3 million). The process consists of updating one metadata field of each document from another field, so each document requires a read and a write against the database.
For that, I was thinking of using Elasticsearch's scroll API.
The problem is that I get a “java.lang.OutOfMemoryError: GC overhead limit exceeded” exception partway through processing (even though I have -Xmx = -Xms = 24g in JAVA_OPTS).
I have tried different garbage collector configurations, but noticed no significant effect.
Can someone help me or give me an idea of how to process a large batch of documents in Nuxeo?
Thank you in advance.
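A GC-overhead error on a run like this usually means results from earlier batches are being retained instead of released. One way to keep the heap flat is to fetch a bounded page of document ids at a time (which is what a scroll does), update that page, then drop it before fetching the next one, saving/committing per batch. Below is a minimal self-contained sketch of that pattern; `fetchPage`, the page size, and the doc ids are all hypothetical stand-ins for the actual scroll call and batch save in your code, not Nuxeo or Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchProcessor {

    // Hypothetical page source standing in for one scroll request:
    // returns the next fixed-size page of document ids, or an empty
    // list when the result set is exhausted.
    static List<String> fetchPage(int pageIndex, int pageSize, int total) {
        List<String> page = new ArrayList<>();
        int start = pageIndex * pageSize;
        int end = Math.min(start + pageSize, total);
        for (int i = start; i < end; i++) {
            page.add("doc-" + i);
        }
        return page;
    }

    // Walks the whole result set in fixed-size batches; returns the
    // number of documents processed. Nothing from a previous batch is
    // kept alive, so heap usage stays bounded by one page.
    static int processAll(int total, int pageSize, Consumer<String> update) {
        int processed = 0;
        int pageIndex = 0;
        while (true) {
            List<String> page = fetchPage(pageIndex++, pageSize, total);
            if (page.isEmpty()) {
                break;
            }
            for (String id : page) {
                update.accept(id); // read + write the metadata of one document
                processed++;
            }
            // In a real Nuxeo job you would save the session and commit
            // the transaction here, once per batch, before the page is
            // dropped and the next scroll page is fetched.
        }
        return processed;
    }

    public static void main(String[] args) {
        int done = processAll(10_000, 500, id -> { });
        System.out.println(done);
    }
}
```

The key point is the commit-per-batch boundary: if all 3 million updates accumulate in one transaction or one result list, the heap fills no matter how large -Xmx is.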
Can you help me with using bulk upload with custom metadata?
What type of bulk upload are you using?
For example, the following doc https://doc.nuxeo.com/nxdoc/nuxeo-bulk-document-importer/ explains how to use the Nuxeo Bulk Document Importer.
There are other options as well, such as the CSV importer…
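With the CSV importer, custom metadata is supplied as extra columns whose headers are the property xpaths. A sketch of what such a file might look like, assuming the standard dublincore schema (the column names and values here are illustrative, so check the importer's documentation for the exact expected format):

```csv
"name","type","dc:title","dc:description"
"my-first-doc","File","My first document","Imported with custom metadata"
"my-second-doc","File","My second document","Another imported document"
```

The same idea applies to custom schemas: add a column per property xpath you want to set at import time.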