
Allow passing in cr/cl bounds and other settings#6

Open
winston-zillow wants to merge 3 commits into 12wang3:main from winston-zillow:main

Conversation

@winston-zillow

Fix execution on CPU and GPU. Fix model loading.

```bash
# trained on the tic-tac-toe data set with one GPU.
python3 experiment.py -d tic-tac-toe -bs 32 -s 1@16 -e401 -lrde 200 -lr 0.002 -ki 0 -mp 12481 -i 0 -wd 1e-6 &
python3 experiment.py -d tic-tac-toe -bs 32 -s 1@16 -e401 -lrde 200 -lr 0.002 -ki 0 -mp 12481 -i cuda:0 -wd 1e-6 &
```

Note: see review comment on args.py changes

```python
rrl_args.test_res = os.path.join(rrl_args.folder_path, 'test_res.txt')
rrl_args.device_ids = list(map(int, rrl_args.device_ids.strip().split('@')))
rrl_args.device_ids = list(map(lambda id: torch.device(id), rrl_args.device_ids.strip().split('@'))) \
    if rrl_args.device_ids else [None]
```
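The revised parsing can be sketched as a small helper (a minimal sketch; `parse_device_ids` is a hypothetical name, assuming the same "@"-separated `-i` spec used by experiment.py):

```python
import torch

def parse_device_ids(spec):
    # Hypothetical helper mirroring the args.py change: turn an "@"-separated
    # spec such as "cuda:0@cuda:1" into torch.device objects; an empty spec
    # falls back to [None], which keeps the program in CPU mode.
    if not spec:
        return [None]
    return [torch.device(d) for d in spec.strip().split('@')]
```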

Note: I found that passing in an integer device ID gets the tensors pinned to GPU memory, but the GPU compute utilization stays at 0%, as shown by nvidia-smi. After changing the device ID to the one returned by torch.device("cuda:0"), the GPU is fully utilized. I do not know why that is, since a simple test using a Python loop can drive GPU utilization.

Example run passing in integer device ID:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.142.00   Driver Version: 450.142.00   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   47C    P0    70W / 149W |    322MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     27173      C   ...vs/pytorch_p37/bin/python      319MiB |
+-----------------------------------------------------------------------------+
```

Example run passing in cuda:*:

```
Sat Dec  4 01:31:31 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.142.00   Driver Version: 450.142.00   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   52C    P0   138W / 149W |   1739MiB / 11441MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     27346      C   ...vs/pytorch_p37/bin/python     1736MiB |
+-----------------------------------------------------------------------------+
```

```python
# lower_bound: [continuous cols]
# upper_bound: [continuous cols]
}
return settings
```

Note: I added this new settings file so that the user can pass in CR/CL bounds as well as control normalization, one-hot encoding, etc. (those are currently hard-coded).
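For illustration, such a per-dataset settings dict might look like the following (the flag names here are hypothetical, matching the lower_bound/upper_bound comments above and the options mentioned in this PR):

```python
def get_settings():
    # Hypothetical example of the new per-dataset settings: CR/CL bounds for
    # the continuous columns plus flags for the preprocessing steps that used
    # to be hard-coded.
    settings = {
        'normalize': True,
        'one_hot_encode_features': False,  # dataset is already one-hot encoded
        'impute_continuous': True,
        'lower_bound': [0.0, -1.0],        # one entry per continuous column
        'upper_bound': [1.0, 1.0],
    }
    return settings
```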

```python
if self.left is not None and self.right is not None:
if cl is not None and cr is not None:  # bounds are specified
    cl = torch.tensor(cl).type(torch.float).t()
    cr = torch.tensor(cr).type(torch.float).t()
```

Note: here we can pass in the cl/cr bounds directly.
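A minimal sketch of that path, assuming the cl/cr bounds arrive as [features][nodes] lists so that the `.t()` transpose yields the expected `(n, input_dim[1])` shape (the concrete numbers are made up for illustration):

```python
import torch

# Hypothetical bounds for n = 3 nodes and input_dim[1] = 2 features,
# given as [features][nodes] so that .t() produces shape (n, input_dim[1]).
cl = [[-1.0, -2.0, -3.0], [-1.5, -2.5, -3.5]]
cr = [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5]]

cl_t = torch.tensor(cl).type(torch.float).t()
cr_t = torch.tensor(cr).type(torch.float).t()
assert cl_t.size() == torch.Size([3, 2])  # (n, input_dim[1])
assert cr_t.size() == torch.Size([3, 2])
```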

```python
cl = 3. * (2. * torch.rand(self.n, self.input_dim[1]) - 1.)
cr = 3. * (2. * torch.rand(self.n, self.input_dim[1]) - 1.)
assert torch.Size([self.n, self.input_dim[1]]) == cl.size()
assert torch.Size([self.n, self.input_dim[1]]) == cr.size()
```

Note: and verify the shapes are correct.

```python
        estimated_grad=estimated_grad)

self.net.cuda(self.device_id)
if self.device_id and self.device_id.type == 'cuda':
```

Note: the condition allows the program to run in CPU mode as well.
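The guarded move can be sketched as follows (a minimal sketch; `maybe_cuda` is a hypothetical helper name, while the PR puts the same check inline around `self.net.cuda`):

```python
import torch
import torch.nn as nn

def maybe_cuda(net, device):
    # Move the model to the GPU only when a CUDA device was actually
    # requested; with device None (or a CPU device) the model stays on
    # the CPU, which is what lets the program run in CPU mode.
    if device is not None and device.type == 'cuda':
        net = net.cuda(device)
    return net

net = maybe_cuda(nn.Linear(4, 2), None)  # stays on CPU
```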

```python
self.feature_enc = preprocessing.OneHotEncoder(categories='auto', drop=drop)
self.imp = SimpleImputer(missing_values=np.nan, strategy='mean')
self.feature_enc = preprocessing.OneHotEncoder(categories='auto', drop=drop) if one_hot_encode_features else None
self.imp = SimpleImputer(missing_values=np.nan, strategy='mean') if impute_continuous else None
```

Note: for datasets that do not require, or already have, one-hot encoding or imputation, those steps can now be skipped.
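The optional construction can be sketched as a free function (a minimal sketch assuming scikit-learn; `build_preprocessors` is a hypothetical name, while `one_hot_encode_features` and `impute_continuous` are the flags added in this PR):

```python
import numpy as np
from sklearn import preprocessing
from sklearn.impute import SimpleImputer

def build_preprocessors(one_hot_encode_features, impute_continuous, drop=None):
    # Construct each preprocessing step only when the dataset needs it;
    # None means "skip this step entirely".
    feature_enc = preprocessing.OneHotEncoder(categories='auto', drop=drop) \
        if one_hot_encode_features else None
    imp = SimpleImputer(missing_values=np.nan, strategy='mean') \
        if impute_continuous else None
    return feature_enc, imp

enc, imp = build_preprocessors(one_hot_encode_features=False,
                               impute_continuous=True)
```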

@12wang3
Owner

12wang3 commented Dec 7, 2021

Thank you very much for the PR. I am busy with other work right now and will check the code after Dec 9.

@ASan1527

ASan1527 commented Nov 8, 2022

[screenshot]
I can't catch the device_ids, and I only have a single GPU, so I don't know how to change the code. Could you please tell me how to solve it? Thank you!

@12wang3
Owner

12wang3 commented Nov 8, 2022

> [screenshot] I can't catch the device_ids, and I only have a single GPU, so I don't know how to change the code. Could you please tell me how to solve it? Thank you!

Could you please show the command you used? Have you set the "-i" argument? It seems you did not set device_ids, since your device_ids was None. If you only have a single GPU, you can use "-i 0" to set it. By the way, maybe we should use an issue rather than a PR to discuss questions.
