SPAM: now with real meat

For general rambling.
quantus
Tenth Dan Procrastinator
Posts: 4891
Joined: Fri Jul 18, 2003 3:09 am
Location: San Jose, CA

Post by quantus »

So, how much spam have you filtered over the last year or so? I've been filtering my email since December 23rd or so last year and I've caught 2375 pieces of spam with my sieve script. SpamAssassin support was added around March to aid in capturing all this spam. How much spam have you caught?
Have you clicked today? Check status, then: People, Jobs or Roads

Jonathan
Grand Pooh-Bah
Posts: 6722
Joined: Tue Sep 19, 2006 8:45 pm
Location: Portland, OR

Post by Jonathan »

Unfortunately, I haven't got continuity in my data. I changed email addresses in that time frame. Also, dwindlehop@cmu.edu expired, so I ceased receiving all email directed to that address. Also, I changed spam filters from SpamAssassin plus Bogofilter to Mozilla Mail's Bayesian filter. Also, I read all my email on my Hiptop, anyway, which has no filter support. Uh, yeah. So basically, I have no idea.

I will say that keeping your email address a moving target does wonders for your spam intake. I will also say that I believe having a weird TLD is also beneficial, spamwise, because many scripts don't recognize .name as a TLD.

How much legitimate email did you acquire in the same timeframe? What was the ratio of spam to ham? What was the positive result percentage (spams tagged as spam)? What was the percentage of spam that got through? What was the percentage of good email that got tagged as spam?
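For reference, the four figures asked about here can be computed from four raw counts. The sketch below is only an illustration: the function name and the count names are made up for this example, and the numbers in the usage note are hypothetical, not anyone's real mailbox statistics.

```python
def filter_metrics(spam_caught, spam_missed, ham_flagged, ham_delivered):
    """Summarize a spam filter's results from four raw message counts.

    All parameter names are illustrative; they do not come from any tool.
    """
    total_spam = spam_caught + spam_missed
    total_ham = ham_flagged + ham_delivered
    if total_spam == 0 or total_ham == 0:
        raise ValueError("need at least one spam and one ham message")
    return {
        # ratio of spam to ham
        "spam_to_ham_ratio": total_spam / total_ham,
        # spams tagged as spam
        "catch_rate": spam_caught / total_spam,
        # spam that got through
        "miss_rate": spam_missed / total_spam,
        # good email that got tagged as spam
        "false_positive_rate": ham_flagged / total_ham,
    }
```

With hypothetical counts such as `filter_metrics(95, 5, 2, 198)`, the catch rate is 95 out of 100 spams and the false positive rate is 2 out of 200 good messages.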

Are you deleting spam, directing to a separate file or directory, or other? I would like to bounce, but sometimes I think that I will just wind up sending junk to whoever's email address got spoofed.

quantus
Tenth Dan Procrastinator
Posts: 4891
Joined: Fri Jul 18, 2003 3:09 am
Location: San Jose, CA

Post by quantus »

The problem with changing your email address often is much like the problem of changing your cell phone number or even address often. You need to tell everyone the new address or number. Yes, this will keep you from getting spam. An interesting side note would be that changing your telephone number might actually cause you to get more telemarketers.

The strange TLD is also a pain when trying to enter your email address into some forms because they'll reject it saying that it's an invalid email address.

I haven't gone through to count the amount of legitimate mail in my junk folder, but it's quite low. I'll also say that the amount of real mail making it to my spam folder lately is almost nonexistent. Most of it got in there nearer the beginning when I was still tuning the script. Pretty much all of it is because someone was sending mail from hotmail or some other known spam domain and I didn't have their address to snatch it away from the meat grinder.

I am not deleting spam, just filing it into a junk folder for brief manual inspection. The problem with bouncing or rejecting is that you may sometimes lose mail that was actually meant for you. I learned that rejected mail is not guaranteed to be delivered to the rejectee as part of the general mail handling spec. Also, as you said, you'll likely just spam the person whose address got spoofed. Finally, if someone rejects the rejected mail back to you, it will bypass your script and be delivered anyway.

The way I file email from certain people to special folders with sieve makes the amount of legitimate mail I received a difficult statistic to extract. I can tell you that so far this month, only one piece of mail has evaded my filter, and that was on October 1st. Also, to put this in the context of the amount of spam I get, it's about 15-20 pieces of spam a day. So, this month so far, less than 0.5% of spam has beaten my filter.
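The post does not say how many days into the month it was written, so the "less than 0.5%" figure cannot be checked directly. What can be worked out is how many days of mail are needed before a single miss drops below that rate. A small sketch, with a made-up function name and exact fractions to avoid rounding surprises:

```python
import math
from fractions import Fraction


def min_days_for_miss_rate(spam_per_day, misses, target_rate):
    """Smallest whole number of days such that
    misses / (spam_per_day * days) is strictly below target_rate."""
    # Total spam must strictly exceed misses / target_rate.
    needed_total = Fraction(misses) / Fraction(target_rate)
    return math.floor(needed_total / spam_per_day) + 1


half_percent = Fraction(1, 200)  # 0.5%
days_at_low_volume = min_days_for_miss_rate(15, 1, half_percent)
days_at_high_volume = min_days_for_miss_rate(20, 1, half_percent)
```

One miss falls below 0.5% once more than 200 pieces of spam have arrived, which takes 14 days at 15 a day or 11 days at 20 a day.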

SpamAssassin by itself is certainly not a good enough filter, and as such, I have several other rules looking into the headers trying to find spam. My most recent addition is to actually look at the mail servers that the piece of mail went through. If it hits a server in another country, it's filtered as spam.
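The post does not say how the sieve rule decides that a relay is in another country. One plausible reading, sketched here in Python rather than Sieve, is to look for country-code TLDs in the hostnames of the Received headers. The TLD set, the regular expression, and the function names are all assumptions made for this illustration.

```python
import re
from email import message_from_string

# Hypothetical set of country-code TLDs to treat as "another country".
FOREIGN_TLDS = {"cn", "ru", "kr", "br"}

# Hostnames that follow "from" or "by" in a Received header (a rough heuristic).
HOST_RE = re.compile(r"\b(?:from|by)\s+([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def relay_hosts(raw_message):
    """Return the relay hostnames named in the message's Received headers."""
    msg = message_from_string(raw_message)
    hosts = []
    for header in msg.get_all("Received", []):
        hosts.extend(HOST_RE.findall(header))
    return hosts


def passed_through_foreign_relay(raw_message):
    """True if any relay hostname ends in one of the listed country-code TLDs."""
    return any(
        host.rsplit(".", 1)[-1].lower() in FOREIGN_TLDS
        for host in relay_hosts(raw_message)
    )
```

A hostname's TLD is only a rough hint about where a server sits, and Received headers added before the first trusted server can be forged, so a check like this is a heuristic rather than proof.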

I'm starting to move away from the huge long list of known domains that spam has come from in the past in favor of smarter rules like the one above. However, I still maintain most of this list since there are quite a lot of spammers in this country as well. To help limit the size of this list, I've started looking for common words that spammers like to include in their domains and filter on those in place of full domains containing those words. Every once in a while I come back to this problem of just too many domains to see if I can come up with a better solution.
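The trade-off between a full domain list and keyword matching can be sketched as below. This is Python rather than Sieve, and the blocked domain, the keywords, and the function names are hypothetical placeholders, not entries from the actual script.

```python
from email.utils import parseaddr

# Hypothetical examples only; the real list and keywords are not in the thread.
BLOCKED_DOMAINS = {"spam-example.com"}
DOMAIN_KEYWORDS = ("offer", "deal", "promo")


def sender_domain(from_header):
    """Extract the lowercased domain part of a From header value."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()


def is_suspect_sender(from_header):
    """Flag a sender by exact domain match first, then by keyword in the domain."""
    domain = sender_domain(from_header)
    if not domain:
        return False
    if domain in BLOCKED_DOMAINS:
        return True
    # One keyword stands in for every domain that contains it.
    return any(keyword in domain for keyword in DOMAIN_KEYWORDS)
```

Substring matching keeps the list short but is blunt: a legitimate domain that happens to contain a keyword gets flagged too, which is presumably why known-good addresses need to be checked before rules like this one run.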

Jonathan
Grand Pooh-Bah
Posts: 6722
Joined: Tue Sep 19, 2006 8:45 pm
Location: Portland, OR

Post by Jonathan »

quantus wrote:The problem with changing your email address often is much like the problem of changing your cell phone number or even address often. You need to tell everyone the new address or number. Yes, this will keep you from getting spam. An interesting side note would be that changing your telephone number might actually cause you to get more telemarketers.

The strange TLD is also a pain when trying to enter your email address into some forms because they'll reject it saying that it's an invalid email address.
I was being facetious, but I'll allow it wasn't obvious enough. Changing your email address is bad; not having a TLD recognized by most scripts is bad. Using a forwarding address (jonathan@pearce.name) mitigates this somewhat.

It sounds like you should be tuning SpamAssassin, not adding generic patterns to sieve scripts. SpamAssassin has all the pattern recognition going on already. If stuff is still getting through it, then you should increase the weights of the specific tests which your spam hits. Or you should increase the sensitivity of SpamAssassin to the point where it starts marking your spam as spam. Or you should contribute heuristics to their test list if you have something clever that they don't do.

Heh, Kerry Wood just hit a 3 run home run to tie the game.

Alan
Veteran Doodler
Posts: 2758
Joined: Fri Jul 18, 2003 2:32 am
Location: Where I am

Post by Alan »

Why can't the Marlins just lose like good little bitches.

Same with the Yankees.

Alan
Veteran Doodler
Posts: 2758
Joined: Fri Jul 18, 2003 2:32 am
Location: Where I am

Post by Alan »

Looks like the Cubs will suck forever.

Jason
Veteran Doodler
Posts: 1520
Joined: Fri Jul 18, 2003 12:53 am
Location: Fairfax, VA

Post by Jason »

Alan wrote:Looks like the Cubs will suck forever.
yep, that fan is in deep shit.

Alan
Veteran Doodler
Posts: 2758
Joined: Fri Jul 18, 2003 2:32 am
Location: Where I am

Post by Alan »

I was surprised they didn't lynch him right then and there.

Jonathan
Grand Pooh-Bah
Posts: 6722
Joined: Tue Sep 19, 2006 8:45 pm
Location: Portland, OR

Post by Jonathan »

Alan wrote:Why can't the Marlins just lose like good little bitches.

Same with the Yankees.
Goddamn, seriously. If the Marlins and the Yankees could just both lose, I'd be happy.

quantus
Tenth Dan Procrastinator
Posts: 4891
Joined: Fri Jul 18, 2003 3:09 am
Location: San Jose, CA

Post by quantus »

Dwindlehop wrote:It sounds like you should be tuning SpamAssassin, not adding generic patterns to sieve scripts. SpamAssassin has all the pattern recognition going on already. If stuff is still getting through it, then you should increase the weights of the specific tests which your spam hits. Or you should increase the sensitivity of SpamAssassin to the point where it starts marking your spam as spam. Or you should contribute heuristics to their test list if you have something clever that they don't do.
You know, I would do this, but I'm pretty sure I don't have access to SpamAssassin's config to do this. I only have access to sieve.

Jonathan
Grand Pooh-Bah
Posts: 6722
Joined: Tue Sep 19, 2006 8:45 pm
Location: Portland, OR

Post by Jonathan »

quantus wrote:
Dwindlehop wrote:It sounds like you should be tuning SpamAssassin, not adding generic patterns to sieve scripts. SpamAssassin has all the pattern recognition going on already. If stuff is still getting through it, then you should increase the weights of the specific tests which your spam hits. Or you should increase the sensitivity of SpamAssassin to the point where it starts marking your spam as spam. Or you should contribute heuristics to their test list if you have something clever that they don't do.
You know, I would do this, but I'm pretty sure I don't have access to SpamAssassin's config to do this. I only have access to sieve.
Ha!
This is the current list of tests SpamAssassin(tm) performs on mail messages to determine if they're spam or not. If you wish to change the score from the default, add a line like this to your ~/.spamassassin/user_prefs:

score NAME_OF_TEST 3.0
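To make the effect of such a line concrete: SpamAssassin adds up the scores of the tests a message hits and compares the total with a threshold (`required_score`, 5.0 by default). The sketch below mimics that arithmetic in Python. It is a simplification that reads only the first number on each `score` line, and the test names and default scores in the usage note are invented for illustration.

```python
DEFAULT_THRESHOLD = 5.0  # SpamAssassin's default required_score


def parse_score_overrides(user_prefs_text):
    """Collect 'score NAME VALUE' lines from user_prefs-style text.

    Simplified: only the first numeric value on each score line is used.
    """
    overrides = {}
    for line in user_prefs_text.splitlines():
        parts = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(parts) >= 3 and parts[0] == "score":
            overrides[parts[1]] = float(parts[2])
    return overrides


def is_spam(hit_tests, default_scores, overrides, threshold=DEFAULT_THRESHOLD):
    """Sum the per-test scores, preferring user overrides, and compare to the threshold."""
    total = sum(overrides.get(name, default_scores[name]) for name in hit_tests)
    return total >= threshold
```

With hypothetical defaults of 1.0 for `TEST_A` and 2.5 for `TEST_B`, a message hitting both totals 3.5 and passes; after `score TEST_A 3.0` the same message totals 5.5 and is marked as spam.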

quantus
Tenth Dan Procrastinator
Posts: 4891
Joined: Fri Jul 18, 2003 3:09 am
Location: San Jose, CA

Post by quantus »

Ok, the only configuration that's allowed is similar to what's available on Yahoo! Mail, which is the ability to specify addresses and domains which definitely are/are not spammers. Of course sieve allows me to do the same thing in a much more powerful manner. Yes, I could just forward my mail to my own personal machine with SpamAssassin installed and tuned to my needs, but that's too much effort for now. The band-aid approach I'm using is doing quite well.

Jonathan
Grand Pooh-Bah
Posts: 6722
Joined: Tue Sep 19, 2006 8:45 pm
Location: Portland, OR

Post by Jonathan »

quantus wrote:Ok, the only configuration that's allowed is similar to what's available on Yahoo! Mail, which is the ability to specify addresses and domains which definitely are/are not spammers. Of course sieve allows me to do the same thing in a much more powerful manner. Yes, I could just forward my mail to my own personal machine with SpamAssassin installed and tuned to my needs, but that's too much effort for now. The band-aid approach I'm using is doing quite well.
Huh? I'm no longer arguing that you should stop tuning Sieve. I'm just curious as to what you're saying.

You can change the weight of any of the SpamAssassin tests. Like "Character set indicates a foreign language"; "Razor2 gives confidence between 51 and 100"; "Message-Id is fake (in Outlook Express format)"; "Talks about millions of dollars"; or any of the other bazillion SpamAssassin tests. Click on the link and read the list.

quantus
Tenth Dan Procrastinator
Posts: 4891
Joined: Fri Jul 18, 2003 3:09 am
Location: San Jose, CA

Post by quantus »

Dwindlehop wrote:
quantus wrote:Ok, the only configuration that's allowed is similar to what's available on Yahoo! Mail, which is the ability to specify addresses and domains which definitely are/are not spammers. Of course sieve allows me to do the same thing in a much more powerful manner. Yes, I could just forward my mail to my own personal machine with SpamAssassin installed and tuned to my needs, but that's too much effort for now. The band-aid approach I'm using is doing quite well.
Huh? I'm no longer arguing that you should stop tuning Sieve. I'm just curious as to what you're saying.

You can change the weight of any of the SpamAssassin tests. Like "Character set indicates a foreign language"; "Razor2 gives confidence between 51 and 100"; "Message-Id is fake (in Outlook Express format)"; "Talks about millions of dollars"; or any of the other bazillion SpamAssassin tests. Click on the link and read the list.
I saw the link and have read the different tests. There's no way I can get to that config to edit the values used by Cyrus. The values are set for everyone.

Jonathan
Grand Pooh-Bah
Posts: 6722
Joined: Tue Sep 19, 2006 8:45 pm
Location: Portland, OR

Post by Jonathan »

You change ~/.spamassassin/user_prefs. That's a per-user config file to override the server config. I don't see the problem.

Jonathan
Grand Pooh-Bah
Posts: 6722
Joined: Tue Sep 19, 2006 8:45 pm
Location: Portland, OR

Post by Jonathan »

quantus wrote:I saw the link and have read the different tests. There's no way I can get to that config to edit the values used by Cyrus. The values are set for everyone.
Hmm. Are you saying you have no login on your mail server? That's... interesting.

Jonathan
Grand Pooh-Bah
Posts: 6722
Joined: Tue Sep 19, 2006 8:45 pm
Location: Portland, OR

Post by Jonathan »

Yeah, you're right. With CMU's mail setup, you have no ability to tune SA. Suck.

Jonathan
Grand Pooh-Bah
Posts: 6722
Joined: Tue Sep 19, 2006 8:45 pm
Location: Portland, OR

Post by Jonathan »

Write your friendly Andrew admin and tell them you want to configure your SA tests!

quantus
Tenth Dan Procrastinator
Posts: 4891
Joined: Fri Jul 18, 2003 3:09 am
Location: San Jose, CA

Post by quantus »

Dwindlehop wrote:You change ~/.spamassassin/user_prefs. That's a per-user config file to override the server config. I don't see the problem.
Agreed, that is the file, but I'm 99% sure that it would have to reside on cyrus.andrew.cmu.edu and not in my andrew account. This is especially true since there's no info on how to configure SpamAssassin on the computing info page while there is info about Sieve there. I will try it though just because you want me to. I'll try upping the score for ROT13 email address to 5 and see if that generates an X-SpamWarning.

quantus
Tenth Dan Procrastinator
Posts: 4891
Joined: Fri Jul 18, 2003 3:09 am
Location: San Jose, CA

Post by quantus »

Dwindlehop wrote:Write your friendly Andrew admin and tell them you want to configure your SA tests!
There is the 1% chance they did point ~/ to the andrew home directory for the addressee.
