Proposal:
It is difficult task. I think the best we can do it is use user_id in cache key and invalidate it when permission changes. Usually for this purposes, can be used so called cache group. And in same cases you can remove all cache keys inside certain group. I think it is the best solution.
Designs
Child items
...
Show closed items
Linked items
0
Link issues together to show that they're related or that one is blocking others.
Learn more.
Some brainstorming, I didn't think about the details yet:
Would it be possible to cache the notes based on the referenced projects?
That way we can use the cache for all users who can see those projects.
example:
a note contains only refs to public projects or doesn't contain links -> cache it for all users
a note contains only refs to internal projects or doesn't contain links -> cache it for all registered users
The tricky part about this proposal would be that the possible combinations of projects are endless.
We could set a limit here where we only cache notes that contain less than x different project visibilities and don't use caching for the rest.
That would probably allow us to cache 90% of the notes and we can share the cache between users.
@haynes Extend that to groups which can be public or private as well, and I think you're onto something. It's a bit complex, but not unreasonably so, and it will work which is what we're looking for :)
Always render all references regardless if the project/group exists and return 404 errors when not authenticated or the project/group doesn't exist.
Performance wise this is the optimal solution, because everybody will see the same note.
I'm not sure if that's the nicest thing from a usability perspective, but it's definitely the fastest and probably easiest solution.
Rendering @private-group as a link leaks group existence. Rendering public-group/private-project#123 as a link leaks project and issue existence, and leaks the issue title since it's in the HTML link's title attribute. As far as I can see, Markdown and authorization need to go hand in hand, at least to some extent.
Idea:
Render every link in Markdown, regardless of the current_user's access level, adding data-group_id and data-project_id to all rendered links. Cache this output HTML. Upon rendering, run one last pipeline processor over this cached HTML, that will verify access to the group and project IDs, and replace the links by their text content if the user doesn't have access.
This would allow caching of almost all of the overhead of Markdown rendering.
@DouweM No. In that proposal the goal is to render everything that looks like a reference regardless if goal of the link actually exists.
So render everything that looks like it might be a reference as such.
examples:
public-group/private-project#123 <-- project exists, but user doesn't have the permissions: render the link but return 404 on click
public-group/private-project2#123<-- project does not exist: render the link but return 404 on click
edit: whoops should've read more carefully. Yes the title would be a problem.
@DouweM I think what we really want is probably a combination of my first proposal and your idea.
I would imagine a filter that has to search through a long note to be slow.
Ideally we'd save the referenced projects/groups in a note to the note Model, so you only need to invoke the Filter when it contains links to private stuff.
For everything else you can use a cached version for notes that contain only public/no references and another one for notes that contain references to internal references.
I would imagine a filter that has to search through a long note to be slow.
@haynes The main problem was that running 20 different pipeline filters was slow. Running 1 is acceptable.
Ideally we'd save the referenced projects/groups in a note to the note Model, so you only need to invoke the Filter when it contains links to private stuff.
To know "when it contains links to private stuff", we'd need to check permissions for every project anyway, just like the filter would. We don't win much here, except making it much more complex by saving all references in the DB.
Also don't forget that project access levels and user's access to groups and projects can change.
@DouweM Yes. You'd need the check anyway. But if a note doesn't contain references to internal/private projects/groups, you don't need to parse the note at all.
Perhaps something much simpler like a boolean value (note contains references yes/no) would be enough to avoid using the filter unnecessarily.
Anyway. Let's implement your proposal first and if it's still slow for some notes we can think about mine again ;)
@DouweM@haynes
Taking into account @dzaporozhets's words about different content of notes for different users. I will repeat, he said it is a bad design approach. And I totally agree. My question is, why can't we use standard approach for entities which need to be prematurely parsed? I mean we can create column like note_parsed that will be filled out after note creation (and updated after update). And we don't need a cache at all.
PS I think we just need to show 404 or 402 errors for all pages the user has no access to.
Taking into account @dzaporozhets's words about different content of notes for different users. I will repeat, he said it is a bad design approach. And I totally agree.
@vsizov The reality is that right now, the content of different notes is different for different users, because the set of "existing groups and projects" is not the same to every user.
We could just render every x/y#123 as a link, regardless if x/y is an existing project, but that may cause confusion when it causes error 404 for a selection of people.
We could, as @rspeicher and I have proposed earlier, use a single pipeline filter to strip out those links that the user doesn't actually have access to. This greatly improves performance, but is still flexible around project access.
Storing the processed content in the DB is not a good idea since our Markdown parser and its output (including CSS classes etc) change over time.
We could, as @rspeicher and I have proposed earlier, use a single pipeline filter to strip out those links that the user doesn't actually have access to. This greatly improves performance, but is still flexible around project access.
@dzaporozhets I'm not sure about @DouweM, but I haven't done anything with it yet. I can get started with it if you think the additional filter is the right approach.
@DouweM Keeping in mind that this filter is run outside the normal pipeline on already-generated HTML, how do you think we should deal with figuring out what a link is a reference to? As far as I can tell, we can either:
Parse the href attribute to extract object and ID from the link
Add data-* attributes to the reference links.
To me, two seems like the better option. Is there one I'm missing?
@rspeicher I want you to do something with it I cant forgive comments and events being non-cached. And fact that we require current_user in deep of markdown rendering also does not add fun
This has proven more complicated than anticipated - see ongoing discussion in !1090 (merged) - and I'm not confident it can be ready (implemented, thoroughly tested) in time for 8.0, especially now that I'll be doing release manager stuff at the same time.