The environment is CentOS 7.4 with CUDA 9.0 and one GeForce GTX 1080 Ti.
- Run map + reduce on datasets with 100,000,000 elements - multiple partitions
- Run map + map + reduce on datasets - multiple partitions
- Run map + map + map + collect on datasets
- Run map + map + map + reduce on datasets - multiple partitions
- Run map on dataset with a single primitive array column
- Run map with free variables on dataset with a single primitive array column
- Run reduce on dataset with a single primitive array column
- Run map & reduce on a single primitive array in a structure *** FAILED ***
jcuda.CudaException: CUDA_ERROR_OUT_OF_MEMORY
at jcuda.driver.JCudaDriver.checkResult(JCudaDriver.java:312)
at jcuda.driver.JCudaDriver.cuCtxCreate(JCudaDriver.java:1444)
at com.ibm.gpuenabler.GPUSparkEnv$.get(GPUSparkEnv.scala:143)
at com.ibm.gpuenabler.CUDADSFunctionSuite$$anonfun$47.apply$mcV$sp(CUDADSFunctionSuite.scala:743)
at com.ibm.gpuenabler.CUDADSFunctionSuite$$anonfun$47.apply(CUDADSFunctionSuite.scala:740)
at com.ibm.gpuenabler.CUDADSFunctionSuite$$anonfun$47.apply(CUDADSFunctionSuite.scala:740)
at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
at org.scalatest.Transformer.apply(Transformer.scala:22)
...
- Run logistic regression *** FAILED ***
org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 1.0 failed 1 times, most recent failure: Lost task 5.0 in stage 1.0 (TID 13, localhost, executor driver): jcuda.CudaException: CUDA_ERROR_INVALID_CONTEXT
at jcuda.driver.JCudaDriver.checkResult(JCudaDriver.java:312)
at jcuda.driver.JCudaDriver.cuModuleLoadData(JCudaDriver.java:2014)
at com.ibm.gpuenabler.CUDAManager$$anonfun$cachedLoadModule$1.apply(CUDAManager.scala:102)
at com.ibm.gpuenabler.CUDAManager$$anonfun$cachedLoadModule$1.apply(CUDAManager.scala:87)
at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:194)
at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80)
at com.ibm.gpuenabler.CUDAManager.cachedLoadModule(CUDAManager.scala:87)
at com.ibm.gpuenabler.CUDAManager.getModule(CUDAManager.scala:62)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$JCUDAIteratorImpl.processGPU(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$JCUDAIteratorImpl.hasNext(Unknown Source)
at com.ibm.gpuenabler.MAPGPUExec$$anonfun$doExecute$1.apply(CUDADSUtils.scala:152)
at com.ibm.gpuenabler.MAPGPUExec$$anonfun$doExecute$1.apply(CUDADSUtils.scala:73)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$26.apply(RDD.scala:843)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$26.apply(RDD.scala:843)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
...