Modify

Opened 14 years ago

Closed 14 years ago

#8487 closed defect (fixed)

[patch] AcctMgr creates blank lines in password_file under Windows

Reported by: Jeremy Dunn Owned by: Steffen Hoffmann
Priority: high Component: AccountManagerPlugin
Severity: major Keywords: EOL style Windows
Cc: Trac Release: 0.12

Description (last modified by Ryan J Ollos)

Every time the AcctMgr plugin accesses the configured Password_file, it adds an extra blank line after each line in the file. First save, file doubles in size, then triples, etc. I have only 73 users, but the file was 125MB. Eventually trac runs out of memory when trying to do any operation.

AcctMgr plugin 0.3dev r9591 
Trac v0.12.  Python v2.5.  Windows 2008 R2
Trac.ini:
[account-manager]
account_changes_notify_addresses = <removed>
force_passwd_change = true
htdigest_realm = trac
notify_actions = new
password_file = c:\TracRoot\tracusers.txt
password_store = HtDigestStore
user_lock_max_time = 0

I had every feature enabled except registration. For the moment I've disabled everything, which is why the lines are commented-out:

[components]
#acct_mgr.admin.* = enabled
#acct_mgr.admin.accountmanageradminpage = enabled
#acct_mgr.api.* = enabled
#acct_mgr.api.accountmanager = enabled
#acct_mgr.db.* = enabled
#acct_mgr.db.sessionstore = disabled
#acct_mgr.guard.accountguard = enabled
#acct_mgr.htfile.* = enabled
#acct_mgr.http.* = enabled
#acct_mgr.notification.* = enabled
#acct_mgr.pwhash.* = enabled
#acct_mgr.svnserve.* = enabled
#acct_mgr.web_ui.* = enabled
#acct_mgr.web_ui.accountmodule = enabled
#acct_mgr.web_ui.loginmodule = enabled
#acct_mgr.web_ui.registrationmodule = disabled

Attachments (2)

fx_8487_eol-handling_on_univ-newline-supp_miss.patch (1.3 KB) - added by Steffen Hoffmann 14 years ago.
preview on possible fix based on current trunk code
fx_8487_eol-handling_reworked_more.patch (3.6 KB) - added by Steffen Hoffmann 14 years ago.
a different approach, largely dumping Universal Newline Support for reading password files, that should be written back - with syntax fix

Download all attachments as: .zip

Change History (30)

comment:1 Changed 14 years ago by anonymous

this might be related to [9272] since it mentions EOL values. I suspect the issue is something different between *nix and Windows. I'm a programmer but not a Python programmer so I'm sorry I cannot contribute a solution.

comment:3 Changed 14 years ago by Ryan J Ollos

Description: modified (diff)

comment:4 Changed 14 years ago by Steffen Hoffmann

Sorry to see that issue, and I suspect that changeset too.

It was clearly bad not to have a Windows test bed for these changes. Glad at least you did test and notice it by now. With *nix systems there has been no report by now, and I saw no problems in several Trac application I maintain myself. So this is clearly related to the special line endings used on Windows systems.

Just for clarification: "Every time the AcctMgr plugin accesses ..." means a write (user add/delete, password update), right? A read operation should really change nothing at all. Could you confirm, please?

I'll have to digg much deeper into the aforementioned changeset [9272]. Looks not too obvious, why just Windows systems should do wrong, since we always add line endings, but now we care about append different endings depending on existing line ending style.

comment:5 Changed 14 years ago by Jeremy Dunn

thanks for your attention to this ticket.

Just for clarification: "Every time the AcctMgr plugin accesses ..." means a write (user add/delete, password update)...

yes, that is correct. when any user changes their own password, or a privileged user changes another user's password, etc.

The odd thing is the file grows *very* fast. As I wrote in the ticket, the password file has 77 actual users. When I found the problem, the file had 42,535,673 lines !! <not> kidding. It was 127,611,031 bytes. I still have the file if we need to analyze it.

It seems hard for me to believe that this was caused only by users. Somehow it must be multiplying the number of blank lines, or similar.

I grep'd out the non-blank lines and created a new password file; and then tried the AcctMgr plugin again. Each time I changed a password, it added just one blank line per user in the file. This doesn't add up to 42M lines !

Anyway the problem is easy to recreate on my system so if you want an example or want me to test a patch I'd be happy to. Unfortunately as I said I'm not a Python programmer. Also it appears the person who installed the AcctMgr plugin may have put it in her home directory, and she's not available for a few weeks more. I'm an admin on the system but inexperienced with Python, so if there's a patch I'll need instructions to install.

comment:6 Changed 14 years ago by Jeremy Dunn

one more thing: a little math shows that 3 * 42,535,673 = 127,607,019, which is 4,012 bytes less than the actual file size. 4,012 / 77 = ~52 bytes/ user, which seems about right. This means that each blank line had -3- characters. Not just \r\n, but something else as well. But not a blank space. grep -c " " <file> yielded only 2 lines. Not sure what it is. Very hard to deal with a file that large in any text editor, and I don't have good *nix tools on this particular system. (no cygwin)

comment:7 Changed 14 years ago by Jeremy Dunn

Aha. I opened the 127M file with a hex editor. The sequence of bytes for each blank line is x0d x0d x0a, i.e. \r\r\n

Perhaps this gives a clue as to what has gone wrong?

comment:8 Changed 14 years ago by Steffen Hoffmann

I suddenly recognized today, that #8422 might be related if not a duplicate of this ticket or vice versa.

comment:9 Changed 14 years ago by Jeremy Dunn

Yes, it seems to be the same.

Also I realized that the file is growing exponentially because each time it adds \r\r\n. Then next time, every \r is turned into another \r\r\n, etc.

comment:10 in reply to:  9 Changed 14 years ago by Steffen Hoffmann

Status: newassigned

Replying to jeremy.j.dunn@gmail.com:

Yes, it seems to be the same.

Also I realized that the file is growing exponentially because each time it adds \r\r\n. Then next time, every \r is turned into another \r\r\n, etc.

There are two places in current code after [9272] that may cause this, if handling is different in Windows that what I had expected from all Python environments. I'll prepare a debug patch, to enable you to help me moving from a guess to facts soon. Thank you for your patience with this nasty bug. This must be fixed ASAP.

comment:11 Changed 14 years ago by Steffen Hoffmann

Keywords: EOL style added
Priority: normalhigh

I meant this, really soon.

comment:12 Changed 14 years ago by Jeremy Dunn

Any status on a fix? I'm happy to test patches on Windows 2008 R2.

comment:13 Changed 14 years ago by Jeremy Dunn

Ok - as I wrote before I'm not a Python programmer, but I think I've found the problem:

in htfile.py, lines 122-127

   elif line.endswith('\n'): 
      if eol == '\n': 
         new_lines.append(line) 
      else: 
         # restore eol 
 	 new_lines.append(line.rstrip('\n') + eol)

For Windows, all lines end with \r\n, so the first elif is true. Then, eol is NOT \n, due to lines 103-106, which set eol to \r\n for Windows. So we drop thru to the 'else' clause. Strip the \n (leaving \r) and add eol (\r\n).

NET RESULT: every \r\n gets turned into \r\r\n, which is exactly the observed behavior.

comment:14 Changed 14 years ago by Steffen Hoffmann

Sure, it is about those lines. But my question is more about, why opening your password file doesn't happen as asked - with universal newline support [1], you see? Current code has been done not without much thought, especially when it comes to the lines in question.

But obviously it seems like your Python is compiled without that highly useful functionality or not using it for other reasons. I've been unable to find existing evidence, that this is a known bug in any specific Windows version.

Anyway, it has become clear by now, that we shouldn't totally rely on working universal newline support but make sure that we get what we expect. So let's add a small test section and act accordingly.

[1] http://docs.python.org/release/2.3.5/whatsnew/node7.html

Changed 14 years ago by Steffen Hoffmann

preview on possible fix based on current trunk code

comment:15 Changed 14 years ago by Steffen Hoffmann

Summary: AcctMgr creates blank lines in password_file under Windows[patch] AcctMgr creates blank lines in password_file under Windows

It's too late to test the patched version here tonight, but please report back any problems you may still encounter after applying my patch.

Thanks for the energy you put into testing development code, bug research, gently kicking me towards a solution and last but not least your patience.

comment:16 Changed 14 years ago by Jeremy Dunn

Thanks @hasienda. Likewise I appreciate your support. Where is "here"? I am in Massachusetts/USA, GMT-5.

Unfortunately I do not know how to apply this patch. As I wrote earlier someone else did the install; and things are a bit confused on our development server; and I am not a Python programmer.

We are running the Windows build of Subversion distributed by CollabNet. It's in c:/csvn/. Python is in c:/csvn/Python25/. Trac appears to be in c:/csvn/Python25/Lib/site-packages/. The only reference to AcctMgr is c:/csvn/Python25/Lib/site-packages/tracaccountmanager-0.3dev_r9591-py2.5.egg. But the .egg does not seem to be unpacked anywhere: htfile.py is nowhere in the c:/csvn/ path.

I have reached out to some colleagues with more Python experience. In a few days perhaps I can send a reply if the patch worked or not.

comment:17 Changed 14 years ago by Jeremy Dunn

Hi again. I figured out how to install the patch. Strangely, it is -not- working, and I don't understand why.

I added two sets of debug logic. First to determine eol format:

            eol = '\n'
            f = open(filename, 'rU')
            lines = f.readlines()
            newlines = f.newlines
            if newlines is not None:
                if isinstance(newlines, str):
                    newlines = [f.newlines]
                elif isinstance(newlines, tuple):
                    newlines = list(f.newlines)
                if '\n' not in newlines:
                    if '\r\n' in newlines:
                        # Windows newline style
                        eol = '\r\n'
                        new_lines.append('Windows format')
                    elif '\r' in newlines:
                        # MacOS newline style
                        eol = '\r'
                        new_lines.append('MacOS format')

Second to determine which if statement is executed, using my initials 'jjd':

            if len(lines) > 0:
                for line in lines:
                    if line.startswith(prefix):
                        if not matched and userline:
                            new_lines.append(userline + 'jjd0' + eol)
                        matched = True
                    # preserve existing eol and 
                    # handle missing universal newline support gracefully too
                    elif line.endswith(eol) or line.endswith('\r') or \
                            line.endswith('\r\n'): 
                        new_lines.append(line + 'jjd1')
                    # unify eol style using universal newline support	
                    elif line.endswith('\n'):					
                        new_lines.append(line.rstrip('\n') + 'jjd2' + eol )
                    # make sure the (last) line has a newline anyway					
                    else:
                        new_lines.append(line +'jjd3' + eol)

After making one user change with this version of the code, the TracUsers.txt file has 'windows format' at the top, then 'jjd2' plus \r\r\n at the END of each line.

known facts:

  • initial state: each line ends with \r\n only (verified in hex editor)
  • eol is \r\n (verified by "windows format")
  • case 'jjd2' is taken, which implies that Universal Newline Support is working properly, else it would take case 'jjd1'
  • the \r\r\n is AFTER 'jjd2' (verified in hex editor)

I don't understand how

new_lines.append(line.rstrip('\n') + 'jjd2' + eol )

can result in <line>jjd2<\r\r\n>, when eol=\r\n, unless there is something strange with append() or with writelines()

comment:18 Changed 14 years ago by Jeremy Dunn

Thinking about this with fresh eyes:

  1. if Universal Newline Support were in fact working, then 'newlines' should contain only \n, since it is the result of readlines(). per PEP278
     Conversion of newlines happens in all calls that read data: read(),
        readline(), readlines(), etc.
    
  1. Presuming UNS is -not- working, then lines[] contains strings ending with \r\n. This matches the observed behavior, where "windows format" is output after newlines contains \r\n
  1. Presuming UNS is -not- working then, case 'jjd2' is removing the \n from \r\n, and adding \r\n, leaving \r\r\n. This is the observed behavior.
  1. If the above are true, then the only thing I don't understand is why isn't this case taken?
    elif line.endswith(eol) or line.endswith('\r') or \
                                line.endswith('\r\n'): 
    

comment:19 Changed 14 years ago by Jeremy Dunn

Oh drat - never mind the above comment. It's obvious we have UNS because newlines is not None. newlines only exists with UNS. Also if my above comment were true, we'd get <line>\rJJD2\r\n, which is not the observed behavior.

So I'm back to my prior comment, not understanding how the extra \r is added.

comment:20 in reply to:  19 Changed 14 years ago by Steffen Hoffmann

Replying to jeremydunn:

![...] So I'm back to my prior comment, not understanding how the extra \r is added.

You're not alone. Thanks again for your continued debugging. Could not dream of a better help with this issue. Awesome. And frightening, as it contradicts my logic too. Anyway, out of curiosity, would you please test with the following lines replaced:

-                    elif line.endswith('\n'):
-                        new_lines.append(line.rstrip('\n') + 'jjd2' + eol
+                    elif line.endswith('\n'):
+                        new_lines.append(line.rstrip('\n').rstrip('\r') + 'jjd2' + eol

As this is a generic strip working for any eol style, it should work regardless of the test result.

comment:21 Changed 14 years ago by Steffen Hoffmann

Thinking it twice, there really shouldn't be a change from the previous micro-patch. I rule out a glitch with the append() method to the new_lines list as well, so I'll have a look at the writelines() part now.

comment:22 Changed 14 years ago by Jeremy Dunn

tried the newest patch, but there is no change as anticipated in your last comment. still getting \r\r\n at the end of each line, AFTER whatever additional text I add for debugging.

comment:23 Changed 14 years ago by Steffen Hoffmann

Don't know why I didn't search before, but there is some known anomaly about Windows handling file writes [1].

So we may want to do a 'binary' write:

         try:
-            f = open(filename, 'w') 
+            f = open(filename, 'wb') 
             f.writelines(new_lines)

or ditch explicit Universal Newline Support as it appears to be used by Windows under the hood in such an unintuitive way. I tend towards using the 'generic strip method' from comment 20 as a new private method and just pass everything else through as it is. And maybe use os.linesep, if we encounter eol style when actually processing lines without any eol character as a last resort. Well, now I'm sitting here to decide, how to go on, and I'm not finished with this yet. Only I hope we can go away with better cross-os behavior without writing (more) Windows-specific code.

If you feel adventurous, there's another aspect to check from the aforementioned discussion: 'symmetrical' file open arguments. So it could be interesting to know, if the following small patch does anything good too:

         try:
-            f = open(filename, 'w') 
+            f = open(filename, 'Uw') 
             f.writelines(new_lines)

I've seen this before, but don't know, if it works to change the Windows eol handling mess. But I may not require to know or rely on your findings for the fix to be made, so feel free to test this second thing or not.

[1] http://bytes.com/topic/python/answers/627424-file-object-behavior

Changed 14 years ago by Steffen Hoffmann

a different approach, largely dumping Universal Newline Support for reading password files, that should be written back - with syntax fix

comment:25 Changed 14 years ago by Jeremy Dunn

@hacienda: Thank you! The new patch is working properly.

I tested add user, delete user, change password. All operations leave the trac user file with correct (Windows \r\n) line endings and no extra lines. For verification, my system is Trac 0.12, Python 2.5, Windows 2008 R2.

From my POV the bug is fixed; but of course it would make sense to regress it on other OS; other versions of Windows; and other versions of Python before closing.

comment:26 Changed 14 years ago by Steffen Hoffmann

I've enjoyed our cooperation. And certainly this helped a lot to re-think what I thought was a perfect solution some months ago, and no-one else reported a problem until now.

I've tested with Trac 0.13dev, Python2.6 on Debian GNU/Linux Squeeze (6.0), so ideally we would get reports for Trac 0.11.x, Python2.4 and maybe even some older MacOS for a full cover. But I'll just commit the changes now, so this will get tested by other adventurous users within the following weeks.

comment:26 Changed 14 years ago by Steffen Hoffmann

(In [9923]) AccountManagerPlugin: Fix password file handling on Windows systems, refs #8487.

This is a urgent follow-up to my changes introduced in changeset [9272]. It has been possible to rework this after finally figuring out, that Universal Newline Support works not as expected when writing (back) files. Many thanks to Jeremy Dunn for initial report, debugging and testing. Any non-*nix user should upgrade to avoid painful exponential file growth. Sorry for any inconvenience you've suffered meanwhile.

comment:27 Changed 14 years ago by Steffen Hoffmann

No more complaints for more than three months - feels like we're done with it, right?

comment:28 Changed 14 years ago by Jeremy Dunn

Resolution: fixed
Status: assignedclosed

I'm happy with it. I just checked, and my tracusers.txt file remains correctly-formatted, without extra line breaks. Thanks again for your support.

  • Jeremy

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Steffen Hoffmann.
The resolution will be deleted. Next status will be 'reopened'.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.