Replace dask_ml.wrappers.ParallelPostFit with custom ParallelPostFit class#832
Replace dask_ml.wrappers.ParallelPostFit with custom ParallelPostFit class#832ayushdg merged 17 commits intodask-contrib:mainfrom
dask_ml.wrappers.ParallelPostFit with custom ParallelPostFit class#832Conversation
Codecov Report
@@ Coverage Diff @@
## main #832 +/- ##
==========================================
- Coverage 77.44% 75.33% -2.11%
==========================================
Files 71 72 +1
Lines 3600 3779 +179
Branches 634 674 +40
==========================================
+ Hits 2788 2847 +59
- Misses 685 795 +110
- Partials 127 137 +10
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
|
Let me know if you think any tests from Dask-ML's |
| model with a :class:`dask_ml.wrappers.ParallelPostFit`. | ||
| model with a :class:`dask_sql.physical.rel.custom.wrappers.ParallelPostFit`. | ||
| Have a look into the | ||
| [dask-ml docu](https://ml.dask.org/meta-estimators.html#parallel-prediction-and-transformation) |
There was a problem hiding this comment.
Remove this documentation link too and your own documentation in dask_sql. I would like us to move away from wrap_fit functionality to auto-detecting if its a sklearn/single-GPU cuML/xgboost model vs a dask model.
VibhuJawa
left a comment
There was a problem hiding this comment.
Requesed Small change, Looks good other-wise.
ayushdg
left a comment
There was a problem hiding this comment.
Thanks for this @sarahyurick. Changes generally look good to me.
In addition to some of the integration tests added here which check the interaction of sql with our wrappers, it might make sense (in a followup) to also add unit tests that independently test the wrappers functionality outside of sql and make sure things are working as expected.
We can probably adapt the tests from https://github.com/dask/dask-ml/blob/94e52613fc87abd4ef8c97510539ded1303ae6f4/tests/test_parallel_post_fit.py and add it to the tests/unit section
As part of our initiative to move away from Dask-ML, I've migrated some code from Dask-ML into Dask-SQL to support ParallelPostFit.