Skip to content

Latest commit

 

History

History
44 lines (31 loc) · 1.76 KB

File metadata and controls

44 lines (31 loc) · 1.76 KB

preprocessing_profiling.ProfileReport(df, format_missing_values=True, model=“DecisionTreeClassifier”)

Parameters

df: pandas dataframe

The dataset itself.

format_missing_values: boolean, optional

When format_missing_values is True, strings that (probably) represent missing values (such as “?”, “na”, “n/a”...) will be replaced with numpy.NaN.

model: string or scikit-learn classifier, optional

The model which will be trained and used to predict the classes of the entries in the test set. Can be either a scikit-learn model or a string with the name of a scikit-learn model.

Currently accepted strings

  • “MLPClassifier”
  • “KNeighborsClassifier”
  • “SVC”
  • “GaussianProcessClassifier”
  • “DecisionTreeClassifier”
  • “RandomForestClassifier”
  • “AdaBoostClassifier”
  • “GaussianNB”
  • “QuadraticDiscriminantAnalysis”
  • “DummyClassifier”

Returns

A ProfileReport object which contains methods to display the report in various ways.

Methods

to_file(outputfile=”report.html”)

Writes the report to a file. The optional parameter outputfile defines the path to the file.

_repr_html_()

When the object is returned to IPython, this method will be called and it will return the report in a format that can be displayed inside, for instance, a Jupyter Notebook cell. The report will have a download button, allowing the user to save the report.*

*The download button may not work in Google Chrome because of a compatibility issue between Jupyter Notebook (version 5 or greater) and Chrome. To be able to save the report, use the to_file method instead.

to_html()

Returns a string with the html of the report.