Opened 12 years ago
Closed 10 years ago
#10942 closed enhancement (fixed)
Improvements for plugin's db schema
Reported by: | Steffen Hoffmann | Owned by: | Steffen Hoffmann |
---|---|---|---|
Priority: | normal | Component: | VotePlugin |
Severity: | normal | Keywords: | schema upgrade db |
Cc: | Trac Release: |
Description
In addition to issues with the db code itself (#10706) I feel that the db schema is flawed too. It should definitely follow Trac's resource concept more closely and store (resource) 'realm' and 'resource_id' of voted resources. This would help to avoid string functions in queries using more portable SQL instead (#4546). And while at it lets track the vote time (stamp) too, because there might be a use case mentioned in #7592.
So how about making a fresh schema version 2 part of the aforementioned db rework then? Count me in, if you need assistance, code review or in-depended testing.
Attachments (0)
Change History (17)
comment:1 Changed 12 years ago by
Type: | defect → enhancement |
---|
comment:2 Changed 12 years ago by
comment:3 follow-up: 6 Changed 12 years ago by
I'm preparing changes to the db schema now.
What I'm worried about is, that these changes are not easy to revert once done. So this is my preliminary schema v2 for discussion:
schema = [ Table('votes', key=('realm', 'resource_id', 'username', 'vote'))[ Column('realm'), Column('resource_id'), Column('username'), Column('vote', 'int'), Column('time', type='int64'), Column('changetime', type='int64'), ] ]
Questions:
- Does the 'vote' belong into primary key? Possible reason: Enforce one vote/user, but this is still not guaranteed now, because in theory there could be two entries, one with vote=-1 and another with vote=1.
- Do we really need to have POSIX microseconds time stamps? It could ease queries with joins on Trac core tables for Trac>=0.12, but apart from that the extra precision seems irrelevant for this plugin.
- Do you see the value of changetime too? I plan to not delete votes once submitted, and use a third vote value '0' instead. Therefor one would be able to see unaltered vs. changed votes. OTOH we'll still not track changes, and I see no reason to do so in the future.
Btw, despite talking about stuff mostly relevant for Trac>=0.12 here, I'll take care for backwards-compatibility down to Trac 0.11 as usual.
comment:4 Changed 12 years ago by
Owner: | changed from Ryan J Ollos to Steffen Hoffmann |
---|
comment:5 follow-ups: 7 11 Changed 12 years ago by
Here is my current thinking regarding development of the plugin:
- Let's plan for the next version with feature enhancements to be 0.2.0, but feel free to bump the 0.1.x line if you fix defects.
- For 0.2.0, I'm planning to tackle #4546 and #7592.
- It might be a good time to start a
changelog
and back-fill it as we go. - There was a recent comment on the trac-users mailing list that the
SqlQueryForReport.txt
wasn't working, so we should take a look at that.
While reviewing the code this weekend I noticed 2 issues:
voteable_paths
supports globs, but a regex is used for the path match. jun66j5 has previously noted that this might not be a good thing to do (#10226).- We should use
db.like()
anddb.like_escape()
(per t:TracDev/DatabaseApi#GuidelinesforSQLStatements) for the SQL query inget_total_vote_count
.
comment:6 follow-up: 8 Changed 12 years ago by
Replying to hasienda:
I'm preparing changes to the db schema now.
What I'm worried about is, that these changes are not easy to revert once done. So this is my preliminary schema v2 for discussion:
The only additional column I could think of having, and I'm not sure it is even relevant, but some resources have a version
. I think that we don't have to worry about this, but we could capture the version of the resource that was voted on.
Questions:
- Does the 'vote' belong into primary key? Possible reason: Enforce one vote/user, but this is still not guaranteed now, because in theory there could be two entries, one with vote=-1 and another with vote=1.
- Do we really need to have POSIX microseconds time stamps? It could ease queries with joins on Trac core tables for Trac>=0.12, but apart from that the extra precision seems irrelevant for this plugin.
- Do you see the value of changetime too? I plan to not delete votes once submitted, and use a third vote value '0' instead. Therefor one would be able to see unaltered vs. changed votes. OTOH we'll still not track changes, and I see no reason to do so in the future.
I have very little experience defining schemas, so I don't think that I'm the best person to answer these questions. You may even want to bring this up on the tracdev mailing list like you did for the timestamp issue. FWIW, the schema looks good to me, and I'd leave vote
out of the primary key and use microsecond timestamps!
I do like that you've added the changetime, and I think it will add some value that we might not even see yet.
You may want to keep in mind my proposal to merge the features of the FiveStarVotePlugin (#7615). I can't see that it will drive any schema changes though.
Btw, despite talking about stuff mostly relevant for Trac>=0.12 here, I'll take care for backwards-compatibility down to Trac 0.11 as usual.
It's up to you, but I'd be okay with having the 0.1.x release line be 0.11-compatible, making the schema change for 0.2.0 and dropping supporting for 0.11.x in VotePlugin 0.2.0.
Btw, feel free to set yourself as co-maintainer on the project wiki page. You are doing a lot of work here!
comment:7 Changed 12 years ago by
Replying to rjollos:
Here is my current thinking regarding development of the plugin:
As I revealed to you minutes ago, there's already some code to make all that happen on my side. Anyway, glad that we agree on this without speaking about it.
- It might be a good time to start a
changelog
and back-fill it as we go.
You seem to like my habit from maintenance of other plugins. So may it be. I've already thought about it too, and will introduce a changelog file with one of the upcoming changes.
- There was a recent comment on the trac-users mailing list that the
SqlQueryForReport.txt
wasn't working, so we should take a look at that.
It'll have to be redefined after the schema upgrade, but luckily I've already prepared one here, so no extra work needed on your side. I'll just update that file and include it in at /contrib
directory on commit time of the new db schema.
While reviewing the code this weekend I noticed 2 issues:
Didn't see that, but already had some strange feelings about that part of the code. I'll have to be reworked for making the include/exclude request happen one way or another (#7610).
- We should use
db.like()
anddb.like_escape()
(per t:TracDev/DatabaseApi#GuidelinesforSQLStatements) for the SQL query inget_total_vote_count
.
Yes and no. You'll see, that these statements will even get obsoleted by moving to more appropriate resource references in the schema v2.
comment:8 follow-up: 9 Changed 12 years ago by
Replying to rjollos:
Replying to hasienda:
I'm preparing changes to the db schema now.
What I'm worried about is, that these changes are not easy to revert once done. So this is my preliminary schema v2 for discussion:
The only additional column I could think of having, and I'm not sure it is even relevant, but some resources have a
version
. I think that we don't have to worry about this, but we could capture the version of the resource that was voted on.
This is an excellent idea. My thinking about a version column stopped after deciding, that a vote wouldn't need a version for itself. While I'm not sure, how to represent votes for previous versions in a minimalistic form right-away, this makes the move to proper resource references complete, sure. I'll implement that in my draft for schema v2 for sure.
Related question: For setting a version on schema upgrade I'd like to use the current (== latest) version, because "silent takeover" is what would happen for votes on any resource update today as well. Do you feel differently?
Note that the meaning of version differs by realm, even for the currently most relevant Trac core realm 'ticket' and 'wiki'. While a new wiki page version could totally change page content, ticket content doesn't change that much by version. It would be sensible to mostly look for changes of summary and description, when deciding on relevance for previous votes.
Questions:
- Does the 'vote' belong into primary key? Possible reason: Enforce one vote/user, but this is still not guaranteed now, because in theory there could be two entries, one with vote=-1 and another with vote=1.
- Do we really need to have POSIX microseconds time stamps? It could ease queries with joins on Trac core tables for Trac>=0.12, but apart from that the extra precision seems irrelevant for this plugin.
- Do you see the value of changetime too? I plan to not delete votes once submitted, and use a third vote value '0' instead. Therefor one would be able to see unaltered vs. changed votes. OTOH we'll still not track changes, and I see no reason to do so in the future.
I have very little experience defining schemas, so I don't think that I'm the best person to answer these questions. You may even want to bring this up on the tracdev mailing list like you did for the timestamp issue. FWIW, the schema looks good to me, and I'd leave
vote
out of the primary key and use microsecond timestamps!
Ok.
I do like that you've added the changetime, and I think it will add some value that we might not even see yet.
I like this notion. We don't loose much, but could miss a chance for upstream development indeed.
You may want to keep in mind my proposal to merge the features of the FiveStarVotePlugin (#7615). I can't see that it will drive any schema changes though.
While I do agree on the merge plan, this might require an additional column to sanely mark the type of vote done for any give resource. Thinking some more into that direction it would even make that 'vote_type' (preliminary working title) a candidate for inclusion into the primary key for schema v2.
Btw, did I mention, that I like collaboration? It's for the extra spin gained by sharing ideas. Thanks for that again.
Btw, despite talking about stuff mostly relevant for Trac>=0.12 here, I'll take care for backwards-compatibility down to Trac 0.11 as usual.
It's up to you, but I'd be okay with having the 0.1.x release line be 0.11-compatible, making the schema change for 0.2.0 and dropping supporting for 0.11.x in VotePlugin 0.2.0.
Btw, feel free to set yourself as co-maintainer on the project wiki page. You are doing a lot of work here!
Thanks. I'll introduce myself to the wiki page later on, and I'll try out, if it's possible to keep both of us informed about new incoming tickets too.
comment:9 follow-up: 10 Changed 12 years ago by
Replying to hasienda:
Related question: For setting a version on schema upgrade I'd like to use the current (== latest) version, because "silent takeover" is what would happen for votes on any resource update today as well. Do you feel differently?
I think that probably makes sense, but ultimately it depends on how we utilize version
. If we intend to record the version
of the resource at which the vote was last changed (corresponding to changetime
), then I think this definitely makes sense.
The alternative would be to leave the version NULL
on schema upgrade. We'd have to take extra care in the code and in queries for the NULL
value, but we should be doing that anyway, so it's really just keeping us honest. The advantage is that there would be no ambiguity between votes that occurred pre and post schema upgrade. So far though, I see no harm to this ambiguity.
Btw, did I mention, that I like collaboration? It's for the extra spin gained by sharing ideas. Thanks for that again.
Oh, same here. Working alone, I will learn some. By collaborating, I learn much more, and am more likely to retain what I learned! Besides that, I think we have the opportunity to develop plugins more quickly by collaborating, and just speaking for myself, I'm more likely to stay motivated and less likely to lose focus.
comment:10 Changed 12 years ago by
Replying to rjollos:
Replying to hasienda:
Related question: For setting a version on schema upgrade I'd like to use the current (== latest) version, because "silent takeover" is what would happen for votes on any resource update today as well. Do you feel differently?
I think that probably makes sense, but ultimately it depends on how we utilize
version
. If we intend to record theversion
of the resource at which the vote was last changed (corresponding tochangetime
), then I think this definitely makes sense.
Yes, this is what I intend to do indeed.
The alternative would be to leave the version
NULL
on schema upgrade. We'd have to take extra care in the code and in queries for theNULL
value, but we should be doing that anyway, so it's really just keeping us honest. The advantage is that there would be no ambiguity between votes that occurred pre and post schema upgrade. So far though, I see no harm to this ambiguity.
Your argument for keeping ambiguous votes like that until next vote update sounds reasonable. I'll remove 'set to latest', that has been implemented for the upgrade script before.
comment:11 Changed 12 years ago by
comment:12 Changed 12 years ago by
I finally had a chance to review your patches sent on March 28th, and they look good to me. I've been looking closely at the BookmarkPlugin lately, in order improve the rendering of bookmarks (#9212) and also fix some exceptions that can occur when certain resources are bookmarked. Your patches have given me a lot of suggestions on how to improve the code for BookmarkPlugin as well.
I have the following thoughts regarding the patches:
- You alerted me to
resource_exists
not existing prior for Trac <= 0.11.7, which I really appreciated (comment:20:ticket:9785). I will get that issue fixed for the BookmarkPlugin. - I'm not sure it is applicable to VotePlugin, but I had to take a lot of care in [13029] to deal with attachments (relevant code starts at line 263 in
__init__.py
). A bookmark in the attachment realm may or may not have a resourceid
(for example, the Attachments List page/attachment/ticket/1
would not have a resourceid
), and it can be difficult to extract the realm, resource id, parent realm and parent id. Consider/attachment/wiki/Page/SubPage/SubSubPage/filename
and/attachment/wiki/Page/SubPage/SubSubPage
can both be bookmarked. Both may have an arbitrarily long parent resourceid
, and in the latter case there is no resourceid
. These complications had me considering whether it might be better to also save the parent realm and id in the table, though I'm unsure of this so far. - I see you've taken care to add a lot of compatibility code. I commend you for this, but honestly, when it comes to fixing the same issues with the BookmarkPlugin, I will likely drop support for Trac < 1.0 before making the changes to the DB schema.
- Really nice that you've take care for resource renaming. I need to do the same for the BookmarkPlugin, caring for at least milestones as well. For the VotePlugin, supporting milestones is probably not so important.
I do more testing and review as you push the changes. I hope you don't mind that I CC you on similar work for the BookmarkPlugin. I see a lot of opportunity here for cross-plugin review.
comment:13 Changed 12 years ago by
(In [13079]) VotePlugin: Moving to new db schema, refs #4546 and #10942.
Major changes are
- conversion of internal resource pointers from path to resource identifiers
- schema version entry in Trac db table
system
- two time stamp columns for both, initial vote and last change/update
Introduce slightly reduced version of common schema upgrade code and unit tests covering possible install/upgrade scenarios. Time stamp columns are prerequisite for something like vote history. This might be available later on as core feature or using a wiki macro.
comment:14 Changed 12 years ago by
It might not look so, but this is the near-minimum of changes to get to revision-tagged db schema. Especially subtle are required changes for packaging to include upgrade code into Python packages.
Despite of this rather big table extension the discussed merge with FiveStarVotePlugin will require yet another column for vote type (working title). But I heartily agree to not hurry for that and save it for next schema revision instead, if it continues to look like a good idea after some time.
comment:15 Changed 12 years ago by
As side-effect of internally moving to resources in [13079] is, that previously vote-able realms need to be re-checked now. Today I did that for milestones and will commit required changes in a few minutes.
Thanks to Trac's default milestone names I discovered an insanity in resource_from_path
during that development work. At first sight it came as a real surprise, that resource IDs could get accidentally altered, if they contain their realm name: 'milestone/milestone1' --> '1' instead of 'milestone1'.
The regular expression function re.sub()
must be limited to one replacement at maximum. Accidental discovery, small change, big gain. Imagine, what nasty bug reports this could have spawn, if undetected.
comment:16 Changed 12 years ago by
(In [13093]) VotePlugin: Re-enable voting on milestones, refs #4546, #7592 and #10942.
delete_vote
obviously needed to not depend on a req
object to work.
That went unnoticed, because it is called only by change listeners for now.
set_vote
was not ready for unversioned resources, and a subtle insanity in
resource_from_path
surfaced during early testing with milestones too.
Method name changes are done for clarity, that a single method call probably deals with multiple vote entries.
comment:17 Changed 10 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
Sounds like good ideas to me. Feel free to push ahead if you have time, and I will try to keep involved and perhaps even find time to move this forward myself.