[ale] Fetchmail IMAP backup

Chris Fowler cfowler at outpostsentinel.com
Tue Jan 3 13:37:43 EST 2017


> From: "Steve Litt" <slitt at troubleshooters.com>
> To: ale at ale.org
> Sent: Tuesday, January 3, 2017 12:50:40 PM
> Subject: Re: [ale] Fetchmail IMAP backup

> On Tue, 3 Jan 2017 10:14:53 -0500 (EST)
> Chris Fowler <cfowler at outpostsentinel.com> wrote:

> > I'm a bit confused on what I can do with fetchmail with IMAP.


> > I want to connect to a remote IMAP server, fetch email, but not
> > delete them (keep). I want to do this all the time so that I can use
> > procmail on where fetchmail is running to search these emails, sort
> > them based on criteria.

> poll mail.bagelpatunias.com protocol IMAP:
> user 'slitt at bagelpatunias.com' there is 'slitt' here
> pass 'wouldnt_you_like_to_know'
> limit 50000000
> warnings 3200
> expunge 60
> mda "/usr/bin/procmail -d %T"
> fetchall
> ssl;

> To leave the mail on the IMAP server, eliminate the expunge command
> (try it to make sure, I've been wrong before).


> > You can't use fetchall and keep in daemon mode

> My experience is that fetchmail keeps running til you stop it.

It can be either. I know when I told it to keep and fetchall it ran, but had a nasty warning that "it will not work". I can understand why. 
I've been running fetchmail in daemon mode, one instance per user, for 10+ years. I was a noob then and have not updated the config. One benefit is that I can poll every 60s for myself while the other users poll every 5 minutes. I need to react to the emails I receive.

I'm in the process of redoing that server. My goal for the upgrade is to change to spamassassin as a daemon, clamav as a daemon, and one fetchmail for all. I'm still the middle man between our email and our Zimbra provider. I feed our provider email with a program I wrote that sends it to them. That program needs an update to support spooling and throttling via a token bucket. They have disabled my account a few times because email addressed to me was sent to them at a rapid pace when our server went to 0% disk left. Fetchmail keeps going after the same stuff, dominoes fall, boom! I can stop this, but I'm lazy. :)
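
For reference, the token bucket throttling mentioned above could look roughly like this. It is a minimal sketch in Python, not the actual forwarding program; the class name, the rate and capacity values, and the injectable clock are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Allow `rate` sends per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum burst size
        self.clock = clock               # injectable so the bucket can be tested
        self.tokens = float(capacity)    # start with a full bucket
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_consume(self, n=1):
        """Take n tokens if available. False means: leave the message in the spool."""
        self._refill()
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

The sender would call try_consume() before each delivery and leave the message spooled when it returns False, so a backlog drains at a bounded pace instead of all at once.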

> > and I can't figure out
> > how fetchmail knows what it has fetched in the past.

> I don't know either, but my guess would be that it marks the downloaded
> messages as "read". Which is a disadvantageous side effect.
It does not. If I stop it and start it back up, it starts at 1 of 14,000. I am not sure if I can specify an epoch on the fetchmail command line. I really don't require fetchmail to know what it has fetched. If I can tell it where to start, then I can track that myself. Programming gives me some options. Maybe Perl with an IMAP client, and I do it myself? I don't know yet. I'm more apt to use programs that do a better job than I'd do, especially those that have been maintained and doing that job for a long time. Fetchmail is more of an expert at fetching IMAP than I am. I know that it does not log the message # or anything else. It passes each message right to procmail.

Damn! I just saw this 

fetchmail: reading message XXXXXXXX:8159 of 14314 (709 header octets) (1936 body octets) not flushed 
fetchmail: Received BYE response from IMAP server: MAILBOX IS IN INCONSISTENT STATE, PLEASE RELOGIN.
fetchmail: socket error while fetchin

A restart puts me back to 1 of 14,314.

I do see this in man page 

-i <pathname> | --idfile <pathname> 
(Keyword: idfile) 
Specify an alternate name for the .fetchids file used to save message UIDs. NOTE: since fetchmail 6.3.0, write access to the directory containing the idfile is required, as fetchmail writes a temporary file and renames it into the place of the real idfile only if the temporary file has been written successfully. This avoids the truncation of idfiles when running out of disk space.

I'm thinking that if that file does not exist, I run with 'fetchall'; otherwise I leave it off. I'm testing now.
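
A rough sketch of that idea as a small Python wrapper, for illustration only. The function name is hypothetical; --idfile, --keep and --all are documented fetchmail options, but whether fetchmail actually consults the idfile for IMAP (as opposed to POP3) is exactly what is still being tested, so this only shows the "fetch everything on the first run" logic.

```python
import os


def fetchmail_args(idfile, keep=True):
    """Build a fetchmail command line; add --all only when the idfile is missing."""
    args = ["fetchmail", "--idfile", idfile]
    if keep:
        args.append("--keep")   # leave messages on the server
    if not os.path.exists(idfile):
        args.append("--all")    # first run: no record of past fetches yet
    return args
```

The resulting list can be handed to subprocess.run() as-is.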

> > The mailbox
> > belongs to a user who needs his email and we need to automatically
> > sort incoming email that is sent to him.

> Huh? Sort? Sort how? To what end? Are you saying you want to move
> messages to various folders on the IMAP directory? Eeeuuuuu!

> Is this user so unreasonably stuck in his work procedures of the past
> that he's willing to make you jump through these kinds of hoops? Is
> this guy paying you?
No. He is not paying me. I would not follow him anyway. 

I'm not moving anything on IMAP. He has a lot of mail, and I need him to audit data for which the accurate data exists in his INBOX.
To make it easier I'm making a copy of his INBOX and filtering out all the emails we need. Since I can program too, I have MANY options for making it even easier. We are going after serial numbers. I can search for serials in each email and then add a copy to a file for that serial number.
Some of these emails have an XLS attachment with many serials. Perl SpreadSheet::ReadExcel can allow me to grok those with regex.
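
As an illustration of the per-serial filing described above, here is a minimal Python sketch (Python rather than Perl, purely as an example). The serial format SN plus six digits is a made-up placeholder, the function names are hypothetical, and XLS attachments are not handled here; only the Subject header and text/plain parts are searched.

```python
import email
import mailbox
import os
import re
from email import policy

# Hypothetical serial format; the real pattern depends on the actual serials.
SERIAL_RE = re.compile(r"\bSN\d{6}\b")


def serials_in_message(raw_bytes):
    """Return the sorted, de-duplicated serials found in the subject and text parts."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    found = set(SERIAL_RE.findall(str(msg.get("Subject", ""))))
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            found.update(SERIAL_RE.findall(part.get_content()))
    return sorted(found)


def file_by_serial(raw_bytes, out_dir):
    """Append a copy of the message to <out_dir>/<serial>.mbox for each serial found."""
    serials = serials_in_message(raw_bytes)
    for serial in serials:
        box = mailbox.mbox(os.path.join(out_dir, serial + ".mbox"))
        try:
            box.add(raw_bytes)
            box.flush()
        finally:
            box.close()
    return serials
```

A procmail recipe could pipe each message to a small script that reads stdin as bytes and calls file_by_serial().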

This helps us audit, but it is also a great idea from now on.

If I had to do this auditing myself I'd be doing what I just wrote above. No way I'm going through 14K emails. If I had to, I'd hire my "butler" to copy them all to a folder first. I'd then think about editing an Excel sheet with the email, but end up hiring my butler to do that too. I'm not paying my butler to do this, because if the person who needed this had done it a few years back when I told him to, he'd be thankful today. It was not much work then. A lot of work now!


> > My idea is to use fetchmail on a computer to attach to his IMAP as an
> > IMAP client, download, and process via procmail.

> HIS IMAP? Wait a minute. Does he have HIS OWN IMAP, under your control?
> Is his IMAP constructed with Dovecot? If so, I think I've already solved
> this problem. I thought you meant his IMAP was on Google Mail or
> something.
It is now. Last night while driving from NC to GA the gears in my head were working on this problem. I originally was going after Zimbra IMAP, but these emails go through our Dovecot first. I have access to 100% of that. It is one huge MBOX file in /var/spool/mail/USER because he does not use IMAP folders.

mbox2mdir then I just grok each one as a file. We are done, problem solved. 
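
If a conversion tool is not handy, the same "one file per message" result can be approximated with Python's standard mailbox module. This is a minimal sketch, not a replacement for mbox2mdir: it writes plain numbered files rather than a real Maildir, and the function name and .eml naming are assumptions.

```python
import mailbox
import os


def split_mbox(mbox_path, out_dir):
    """Write each message in an mbox file to its own numbered file; return the count."""
    os.makedirs(out_dir, exist_ok=True)
    box = mailbox.mbox(mbox_path, create=False)  # fail if the mbox is missing
    count = 0
    try:
        for key in box.iterkeys():
            count += 1
            path = os.path.join(out_dir, "%06d.eml" % count)
            with open(path, "wb") as fh:
                fh.write(box.get_bytes(key))  # raw message without the "From " line
    finally:
        box.close()
    return count
```

Working on a copy of the spool file, rather than the live /var/spool/mail/USER, avoids racing with new deliveries.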

Working this from my desktop, external to the Dovecot server, gives me experience so that when I transfer all email control over to the provider I can do backups, external data gathering, etc. using what I learned today.

One thing I need to do on the Dovecot side, though not related, is to process the MBOX files through spamassassin and clamav. We do incoming backups of everything and those files are huge. I was sick of telling folks I could not "undelete" things they had deleted, so we started incoming backups. Those are done pre-spamassassin. To correct this I need to move the rule to after spam detection and then process the data we created before the screw-up to remove the noise.

> > I'll look for some
> > criteria and then I'll send it to a perl program that will even
> > search XLS files that are attached for criteria.

> I'm not exactly sure what benefit you're trying to give your user, but
> I'm almost positive there's a better way to do it.

Once I'm done and this continues to sort externally, it will be a gift to him. The better way to do it is for him to update the XLS as he receives this stuff. He's been against doing work since he started football in high school almost 50 years ago. :)

> SteveT

> Steve Litt
> December 2016 featured book: Rapid Learning for the 21st Century
The most rapid learning I provide my guys is "You will not do that again will you?" 