
getpapers 'JavaScript heap out of memory error' #191

@J-E-J-S

I am hoping to do an extensive mine (~1 million papers ultimately). As a first step I am attempting a 100k-paper mine, but I am running into memory errors, which I would guess are due to the number of papers it is trying to download.
The error I receive is:

 getpapers -q biotechnology -a -k 100000 -o 100000_biotechnology
info: Searching using eupmc API
(node:14840) Warning: Accessing non-existent property 'padLevels' of module exports inside circular dependency
(Use `node --trace-warnings ...` to show where the warning was created)
info: Found 1030125 results
warn: This version of getpapers wasn't built with this version of the EuPMC api in mind
warn: getpapers EuPMCVersion: 5.3.2 vs. 6.5 reported by api
info: Limiting to 100000 hits
Retrieving results [==========--------------------] 34% (eta 532.9s)
info: EuPMC gave us the wrong hitcount. We've already found all the results
info: Duplicate records found: 33709 unique results identified
info: Saving result metadata

<--- Last few GCs --->

[14840:0000020982D837B0]   308443 ms: Scavenge 1962.9 (2049.8) -> 1961.1 (2064.3) MB, 21.1 / 0.0 ms  (average mu = 0.938, current mu = 0.624) allocation failure
[14840:0000020982D837B0]   308534 ms: Scavenge 1975.6 (2064.3) -> 1975.0 (2065.8) MB, 10.6 / 0.0 ms  (average mu = 0.938, current mu = 0.624) allocation failure
[14840:0000020982D837B0]   308568 ms: Scavenge 1976.9 (2065.8) -> 1975.1 (2079.8) MB, 23.3 / 0.0 ms  (average mu = 0.938, current mu = 0.624) allocation failure


<--- JS stacktrace --->

FATAL ERROR: MarkCompactCollector: young object promotion failed Allocation failed - JavaScript heap out of memory
 1: 00007FF7F0CA058F napi_wrap+109311
 2: 00007FF7F0C452B6 v8::internal::OrderedHashTable<v8::internal::OrderedHashSet,1>::NumberOfElementsOffset+33302
 3: 00007FF7F0C46086 node::OnFatalError+294
 4: 00007FF7F151153E v8::Isolate::ReportExternalAllocationLimitReached+94
 5: 00007FF7F14F63BD v8::SharedArrayBuffer::Externalize+781
 6: 00007FF7F13A084C v8::internal::Heap::EphemeronKeyWriteBarrierFromCode+1516
 7: 00007FF7F138B48B v8::internal::NativeContextInferrer::Infer+59243
 8: 00007FF7F13709BF v8::internal::MarkingWorklists::SwitchToContextSlow+57327
 9: 00007FF7F138460B v8::internal::NativeContextInferrer::Infer+30955
10: 00007FF7F137B72D v8::internal::MarkCompactCollector::EnsureSweepingCompleted+6269
11: 00007FF7F138385E v8::internal::NativeContextInferrer::Infer+27454
12: 00007FF7F13877EB v8::internal::NativeContextInferrer::Infer+43723
13: 00007FF7F1391042 v8::internal::ItemParallelJob::Task::RunInternal+18
14: 00007FF7F1390FD1 v8::internal::ItemParallelJob::Run+641
15: 00007FF7F13648D3 v8::internal::MarkingWorklists::SwitchToContextSlow+7939
16: 00007FF7F137BBDC v8::internal::MarkCompactCollector::EnsureSweepingCompleted+7468
17: 00007FF7F137A424 v8::internal::MarkCompactCollector::EnsureSweepingCompleted+1396
18: 00007FF7F1377F88 v8::internal::MarkingWorklists::SwitchToContextSlow+87480
19: 00007FF7F13A65D1 v8::internal::Heap::LeftTrimFixedArray+929
20: 00007FF7F13A86B5 v8::internal::Heap::PageFlagsAreConsistent+789
21: 00007FF7F139D961 v8::internal::Heap::CollectGarbage+2033
22: 00007FF7F139BB65 v8::internal::Heap::AllocateExternalBackingStore+1317
23: 00007FF7F13B5E06 v8::internal::Factory::AllocateRaw+166
24: 00007FF7F13C9824 v8::internal::FactoryBase<v8::internal::Factory>::NewFixedArrayWithFiller+84
25: 00007FF7F13C9775 v8::internal::FactoryBase<v8::internal::Factory>::NewFixedArray+69
26: 00007FF7F12161B1 v8::internal::LayoutDescriptor::Trim+2065
27: 00007FF7F121A339 v8::internal::LayoutDescriptor::Trim+18841
28: 00007FF7F1216313 v8::internal::LayoutDescriptor::Trim+2419
29: 00007FF7F1219C48 v8::internal::LayoutDescriptor::Trim+17064
30: 00007FF7F1219B72 v8::internal::LayoutDescriptor::Trim+16850
31: 00007FF7F12D37F7 v8::base::TimeDelta::operator!=+11847
32: 00007FF7F12CFD20 v8::internal::TimedHistogram::Stop+16976
33: 00007FF7F12CECFC v8::internal::TimedHistogram::Stop+12844
34: 00007FF7F12D3A1C v8::base::TimeDelta::operator!=+12396
35: 00007FF7F12CFD20 v8::internal::TimedHistogram::Stop+16976
36: 00007FF7F12D02F3 v8::internal::TimedHistogram::Stop+18467
37: 00007FF7F12D25B6 v8::base::TimeDelta::operator!=+7174
38: 00007FF7F1483ED7 v8::internal::Builtins::builtin_handle+81719
39: 00007FF7F1599FCD v8::internal::SetupIsolateDelegate::SetupHeap+464173
40: 00007FF7F15328D2 v8::internal::SetupIsolateDelegate::SetupHeap+40498
41: 00007FF7F15328D2 v8::internal::SetupIsolateDelegate::SetupHeap+40498
42: 00007FF7F15328D2 v8::internal::SetupIsolateDelegate::SetupHeap+40498
43: 00007FF7F15328D2 v8::internal::SetupIsolateDelegate::SetupHeap+40498

Nothing is ultimately downloaded into the output directory, and attempts to restart fail.

Is there any way around this, or a fix? Would it help if I increased the RAM on the machine (currently 8 GB)?
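
For context on the RAM question: the GC lines in the trace stall at roughly 2 GB, which looks more like Node's own V8 heap ceiling than the machine's 8 GB. A minimal sketch for checking that ceiling is below. The `heapLimitMb` helper name is only illustrative, and I am assuming it is run with the same `node` binary that getpapers uses.

```javascript
// Sketch: report the V8 heap limit of the current Node.js process.
const v8 = require('v8');

// Illustrative helper (not part of getpapers): heap limit in megabytes.
function heapLimitMb() {
  // getHeapStatistics().heap_size_limit is reported in bytes.
  const bytes = v8.getHeapStatistics().heap_size_limit;
  return Math.round(bytes / (1024 * 1024));
}

console.log(`V8 heap limit: ${heapLimitMb()} MB`);
```

Starting Node with `--max-old-space-size=<MB>` (directly or through the `NODE_OPTIONS` environment variable) should raise the figure this prints. Whether a higher limit is enough for getpapers to finish saving the metadata is untested here.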

Many Thanks,
James
