-
Notifications
You must be signed in to change notification settings - Fork 65
Fix index clear #463
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix index clear #463
Conversation
Before: the `.clear()` algorithm is wrong:
1. Suppose they are 2200 documents. The first 500 are deleted:
0 1000 2000
| | |
xxxxx-----------------
| |
| index + limit
index
2. There are now 1700 documents, and the next pagination query wrongly
skips the first 500 documents, and remove from 500 to 1000:
0 1000
| |
-----xxxxx-------
| |
| index + limit
index
3. There are now 1200 documents, and the next pagination query wrongly
skips the first 1000 documents, and remove from 1000 to 1200:
0 1000
| |
----------xx
=> Only 1200 of the 2200 documents are remove, and 1000 documents remain.
After: 500 documents are deleted in loop until there is nothing to
delete (with a security condition on the initial documents number, in
case of concurrent insertions).
31a7c6f to
ca62681
Compare
|
Thanks for this fix, @dfroger! You've been doing great work on redis-vl-python. The approach looks solid - switching from offset-based pagination to always-query-from-zero makes sense for a destructive operation like A couple things that would seal the deal:
Once those are addressed, this should be good to go! 🚀 |
|
Thanks @abrookins for the feedback! I think I could fix the 3 points at the beginning of next week. Looking forward! |
correct fix (in redis repo) would be for Query.paging() to return Self, not "Query"
|
@abrookins LGTM for the 3 points! But now sure if the CI has run the unit tests? |
fixes #460