THREESCALE-12133 optimize configuration reload #1564
Conversation
Previously, every time the policy chain was rebuilt, the gateway would look for the manifest files on disk, which added latency to incoming requests. Since the manifests are static and do not change at runtime, it is more efficient to cache them at the module level; this caching speeds up the lookup.
Creating a JSON schema validator is also somewhat expensive, so that step is cached with a local LRU cache.
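Roughly, the idea looks like this (a minimal sketch; the helper names, cache sizes, and the validator-builder function here are illustrative, not the exact code in this PR):

```lua
-- Sketch only: module-level caching of static manifests plus an LRU cache for
-- compiled JSON schema validators. Names and sizes are assumptions.
local cjson = require('cjson')
local lrucache = require('resty.lrucache')

-- Module-level table: lives for the lifetime of the worker, so each manifest
-- file is read from disk at most once instead of on every chain rebuild.
local manifests_cache = {}

local function get_cached_manifest(path)
  local manifest = manifests_cache[path]
  if manifest then return manifest end

  local file = io.open(path, 'r')
  if not file then return nil end

  manifest = cjson.decode(file:read('*a'))
  file:close()

  manifests_cache[path] = manifest
  return manifest
end

-- Small LRU cache so each JSON schema validator is built at most once.
local validators = lrucache.new(64)

-- `build_validator` stands in for whatever function compiles a schema into a
-- validator; `key` could be something like "policy_name:version".
local function get_validator(key, schema, build_validator)
  local validator = validators:get(key)
  if not validator then
    validator = build_validator(schema)  -- the expensive step, now done once
    validators:set(key, validator)
  end
  return validator
end
```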
```lua
if manifest then
  return load_path, manifest.configuration
else
  insert(failures, load_path)
```
Reading Lua for the first time, so excuse me if I miss something.
Whenever the manifest is not found, we insert a failure here. But if it is missing, should we continue with the rest of the method, or should we just return to avoid further processing?
This is probably not performance critical because we don't expect errors to happen often, but it occurred to me to ask whether it makes sense.
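For context, this is how I read the surrounding lookup; the function and argument names here are my assumptions, not the actual code:

```lua
-- Assumed shape of the lookup (simplified): try each load path in turn,
-- return on the first manifest found, and only report the accumulated
-- failures if no path had one.
local insert = table.insert
local concat = table.concat

local function find_manifest(load_paths, read_manifest)
  local failures = {}

  for _, load_path in ipairs(load_paths) do
    local manifest = read_manifest(load_path)

    if manifest then
      return load_path, manifest.configuration
    else
      insert(failures, load_path)  -- remember the miss, keep trying other paths
    end
  end

  -- only reached when every path failed, so this is the error path
  return nil, 'manifest not found in: ' .. concat(failures, ', ')
end
```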
```lua
):check(self)
if manifests then
  PolicyOrderChecker.new(manifests):check(self)
end
```
A nit: it looks like the method is only called in gateway/src/apicast/configuration.lua without parameters, so perhaps it makes sense to simplify it by removing the argument as well as the if clause (see the sketch below).
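Something along these lines is what I mean; the function name and the loader call are guesses, only the PolicyOrderChecker call comes from the diff:

```lua
-- Hypothetical simplification: if no caller ever passes `manifests`, the
-- parameter and the guard can go away and the order check always runs on
-- whatever the loader returns.
function _M:validate_policies()              -- name is a guess
  local manifests = policy_loader:get_all()  -- however the manifests are loaded
  PolicyOrderChecker.new(manifests):check(self)
end
```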
Probably the idea was to use the manifest cache somewhere?
In any case - why is this change implemented? 🤔
It seems to me that:
- It always loads all policies (on line 196) regardless of whether `manifests` is passed or not.
- If `manifests` is passed and is not nil (as @akostadinov pointed out, this does not happen at this point), then the policy order check will be performed twice. Is that intended?
akostadinov left a comment
These are some excellent improvements! Just a couple of minor suggestions that you can ignore if they don't make sense to you.
Now that I'm learning a little more about the APIcast architecture, I think your suggestion of looking at incremental updates makes a lot of sense, so we can discuss the options for implementing it.
```lua
local tab_new = require('table.new')
local isempty = require('table.isempty')

local manifests_cache = tab_new(32, 0)
```
Why 32? A question asked without (yet) understanding how the whole thing works.
We have about 50 policies, and that number may grow over time. I understand that a single customer will probably not use all of them, but is there a big difference in reserving a bit more, just in case?
Actually, reading further, I can see that once all policies are loaded this cache is replaced by the dictionary (a hash-keyed table, or whatever it's called in Lua).
But then, shouldn't we allocate hash slots instead, i.e. `tab_new(0, 32)`, and potentially also bump 32 to a bigger number?
Maybe it's nitpicking, not sure.
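For reference, `table.new(narray, nhash)` reserves array slots with the first argument and hash slots with the second, so the difference looks like this (the policy names are just examples):

```lua
-- table.new(narray, nhash) pre-sizes a LuaJIT table: the first argument
-- reserves consecutive integer (array) slots, the second reserves hash slots
-- for string keys.
local tab_new = require('table.new')

local list_like = tab_new(32, 0)  -- room for t[1]..t[32] without resizing
local map_like  = tab_new(0, 32)  -- room for 32 string-keyed entries

list_like[1] = 'apicast.policy.echo'
map_like['apicast.policy.echo'] = { version = 'builtin' }
```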
```lua
  return format('%s/?.lua', load_path)
end

local function get_manifest(name, version)
```
A few things:
- Maybe name this method `get_cached_manifest` or something like that, so the intention is clearer. There are quite a few methods with similar names (`get_manifest`, `load_manifest`, `read_manifest`), and it's not always clear what each of them does exactly.
- I was wondering whether it could make sense to use `name:version` as the cache key so we can avoid the loop (sketched below). But I guess the idea is for the cache to match exactly the structure of the `manifests` object loaded from the filesystem, so it's easier to operate on. There probably won't be many versions of the same policy anyway, so this is probably premature optimization.
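Roughly the two cache shapes I mean (policy names, versions, and helper names are made up for illustration):

```lua
-- Shape 1: mirror the on-disk `manifests` structure, a list of manifests per
-- policy name, so finding a specific version needs a short loop.
local by_name = {
  ['apicast.policy.headers'] = {
    { version = 'builtin', configuration = {} },
    { version = '1.0.0',   configuration = {} },
  },
}

local function find_by_loop(name, version)
  for _, manifest in ipairs(by_name[name] or {}) do
    if manifest.version == version then return manifest end
  end
end

-- Shape 2: key the cache by "name:version" for a single direct lookup.
local by_name_version = {
  ['apicast.policy.headers:1.0.0'] = { configuration = {} },
}

local function find_by_key(name, version)
  return by_name_version[name .. ':' .. version]
end
```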
```lua
local service = configuration.parse_service(proxy_config.content)

-- We always assign a oidc to the service, even an empty one with the
-- service_id, if not on APICAST_SERVICES_LIST will fail on filtering
```
Maybe we need to leave this comment to explain the reason?
```lua
-- We always assign a oidc to the service, even an empty one with the
-- service_id, if not on APICAST_SERVICES_LIST will fail on filtering
local oidc = self:oidc_issuer_configuration(service)
```
Looks like the `oidc_issuer_configuration` function is not called from anywhere anymore, so it can probably be deleted?
I haven't actually built/run the image, but I left some comments 🙃
What
Fix THREESCALE-12133
NOTE:
Before:
- File read
- Latency
- Flamegraph
After:
- File read
- Latency
- Flamegraph
Verification steps
1. Set up the `docker-compose.yml` file as follows. Update `<token>` and `<3scale_admin_portal>` to match the values from the previous step.
2. Replace `APICast_gateway_IP` and `user_key` with the values from the previous steps.
3. Visit the admin portal and replace the `hostname` with the hostname of one product of your choice.
4. Start the k6 test.