Hi team,
First of all, thank you for building such a good package for us to use.
I followed the example "Counterfactual with Reinforcement Learning (CFRL) on Adult Census" to build my own CFRL explainer.
My data set is a mix of numerical, binary, and categorical features.
I trained a random forest classification model as the predictor and ran CounterfactualRLTabular to generate counterfactuals for the features I am interested in. Below is part of the code showing how I specify the candidate features and the immutable features:
ranges = {
    'num_image': [1, 16],
    'num_alternative_image': [0, 6],
    'num_market_bullets': [5, 19],
}
import numpy as np
from alibi.explainers import CounterfactualRLTabular

explainer = CounterfactualRLTabular(predictor=predictor,
                                    encoder=heae.encoder,
                                    decoder=heae.decoder,
                                    latent_dim=LATENT_DIM,
                                    encoder_preprocessor=heae_preprocessor,
                                    decoder_inv_preprocessor=heae_inv_preprocessor,
                                    coeff_sparsity=COEFF_SPARSITY,
                                    coeff_consistency=COEFF_CONSISTENCY,
                                    category_map=cate_map,
                                    feature_names=model_attr,
                                    # ranges=ranges,
                                    immutable_features=immutable_features,
                                    train_steps=TRAIN_STEPS,
                                    batch_size=BATCH_SIZE,
                                    backend="tensorflow")
explainer = explainer.fit(X=X_train.to_numpy())
# Keep the test rows that the predictor assigns to class 1.
X_positive = X_test[np.argmax(predictor(X_test), axis=1) == 1]
X = X_positive[:1000]
Y_t = np.array([0])  # target class for the counterfactuals
# Column indices: 20 = num_image, 21 = num_alternative_image, 22 = num_market_bullets.
# If I use feature names as keys instead of indices, I get an error.
C = [{20: [1, 10], 21: [0, 6], 22: [5, 10]}]
explanation = explainer.explain(X, Y_t, C)
After I got the counterfactual DataFrame, I compared it with the original DataFrame and looked at which columns differ. avg_delivery_days is immutable, yet it still changes, although only by a very tiny amount. For 'num_image', 'num_alternative_image', and 'num_market_bullets' the change is also minimal. Can I conclude that the changed features play an important role in predicting the label (>0.4 or <=0.4), since a small change flips the label? Did I use the right counterfactual function for my use case?
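To make "very tiny change" concrete, a check along the following lines could separate floating-point noise from real edits. This is only a sketch: `changed_columns`, the toy arrays, and the tolerance `atol=1e-6` are assumptions made for illustration and are not part of alibi. It also assumes both arrays are already numerically encoded.

```python
import numpy as np

def changed_columns(X_orig, X_cf, feature_names, atol=1e-6):
    """Return {feature name: max absolute change} for columns that differ by more than atol."""
    X_orig = np.asarray(X_orig, dtype=float)
    X_cf = np.asarray(X_cf, dtype=float)
    # Largest absolute difference per column across all rows.
    max_abs_diff = np.abs(X_cf - X_orig).max(axis=0)
    return {name: float(d) for name, d in zip(feature_names, max_abs_diff) if d > atol}

# Toy data (made up): the first column plays the role of an immutable feature
# that only moves by floating-point noise.
names = ["avg_delivery_days", "num_image", "num_market_bullets"]
X_orig = np.array([[3.0, 4.0, 7.0], [5.0, 2.0, 9.0]])
X_cf = np.array([[3.0 + 1e-9, 5.0, 7.0], [5.0, 2.0, 9.0]])

diff = changed_columns(X_orig, X_cf, names)
print(diff)  # only num_image exceeds the tolerance
```

If an immutable feature only shows differences below the tolerance, my guess is that it is a float round-trip effect rather than an actual edit, but I have not confirmed this.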

For tabular data, do I always need an encoder and a decoder? If a feature is already binary, should I still include it in category_map in the call below?
heae_preprocessor, heae_inv_preprocessor = get_he_preprocessor(X=X_train, feature_names=model_attr, category_map=cate_map, feature_types=feature_types)
Another question I have is which approach I can use when the environment (predictor) model is a regression model, such as a boosted regression model or another regression-type black-box model.
If I try to use
explainer = CounterfactualRLTabular(predictor=predictor,
                                    encoder=heae.encoder,
                                    decoder=heae.decoder,
                                    latent_dim=LATENT_DIM,
                                    encoder_preprocessor=heae_preprocessor,
                                    decoder_inv_preprocessor=heae_inv_preprocessor,
                                    coeff_sparsity=COEFF_SPARSITY,
                                    coeff_consistency=COEFF_CONSISTENCY,
                                    category_map=cate_map,
                                    feature_names=model_attr,
                                    # ranges=ranges,
                                    immutable_features=immutable_features,
                                    train_steps=TRAIN_STEPS,
                                    batch_size=BATCH_SIZE,
                                    backend="tensorflow")
but replace the predictor with the boosted regression model, what other changes do I need to make? Since the prediction of a regression model is continuous, how can I customize the reward function?
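For context, the sketch below shows the two pieces I imagine I would need: a wrapper that turns the continuous prediction into two-class output using my 0.4 threshold, and a reward that checks whether the prediction lands on the same side of the threshold as the target. Everything here is hypothetical: `ThresholdedRegressor`, `threshold_reward`, and the stand-in `MeanRegressor` are names I made up, and I have not confirmed how (or whether) the explainer accepts a custom reward function.

```python
import numpy as np

THRESHOLD = 0.4  # assumption: same cut-off as my label (>0.4 vs <=0.4)

class ThresholdedRegressor:
    """Wrap a regressor so that calling it returns one-hot two-class output of shape (n, 2)."""

    def __init__(self, regressor, threshold=THRESHOLD):
        self.regressor = regressor
        self.threshold = threshold

    def __call__(self, X):
        y = np.asarray(self.regressor.predict(X), dtype=float).reshape(-1)
        labels = (y > self.threshold).astype(int)  # 1 if above the threshold, else 0
        return np.eye(2)[labels]

def threshold_reward(Y_pred, Y_true, threshold=THRESHOLD):
    """Hypothetical reward: 1.0 where prediction and target fall on the same side of the threshold."""
    Y_pred = np.asarray(Y_pred, dtype=float).reshape(-1)
    Y_true = np.asarray(Y_true, dtype=float).reshape(-1)
    return ((Y_pred > threshold) == (Y_true > threshold)).astype(float)

class MeanRegressor:
    """Stand-in for a boosted regression model: predicts the mean of each row."""

    def predict(self, X):
        return np.asarray(X, dtype=float).mean(axis=1)

# Toy usage with made-up inputs.
wrapped = ThresholdedRegressor(MeanRegressor())
X_demo = np.array([[0.1, 0.3], [0.6, 0.8]])
proba = wrapped(X_demo)
rewards = threshold_reward(MeanRegressor().predict(X_demo), np.array([0.0]))
print(proba)
print(rewards)
```

The wrapper would let me keep the classification-style setup unchanged, while the reward function is what I would want to plug in if the explainer can work with continuous outputs directly. I am not sure which of the two is the intended route.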
Sorry for all these questions. I am a beginner in RL and still learning everything, so please forgive me if my questions sound dumb.
Thanks for your time and help.