During the feet-wet phase they were targeting relatively small lists, using a piece of software called Sendblaster. SB basically combines a WYSIWYG HTML email editor with an MS Jet database facility and an SMTP server. The problem with the pre-2000 Jet engine component is that it's limited to a 1GB database file size. Through trial and error I found that this translates to a single recipient list of ~200K records. It's also not very good about compacting its database files, so even if you faithfully delete a list before importing a new one, it's not hard to overrun the 1GB limit. SB does not warn when this happens; if the Jet engine is throwing any error messages, they're being trapped and hidden from the user. The publisher does provide a database repair tool on request, but seems uninterested in moving forward to the next MS Jet engine (which would double the file size limit to 2GB), much less ditching Jet for a less constrained backend.
So when my client purchased a list of several million leads, I knew that Sendblaster would be radically insufficient. They also started making noises about multiplying the sends-per-hour rate. Sendblaster chugged along at about 1800 sends per hour, so that upper limit of 200K records per list worked out to roughly 111 hours of continuous sending: a work-week of "email marketing".
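For the curious, the back-of-envelope arithmetic behind that work-week figure can be sketched as below. The two constants come straight from the numbers above; the function name is just something I made up for illustration.

```python
SENDS_PER_HOUR = 1800      # observed Sendblaster rate, from the text
MAX_LIST_SIZE = 200_000    # approximate records per list under the 1GB Jet limit

def hours_to_send(records, sends_per_hour=SENDS_PER_HOUR):
    """Hours of continuous sending needed to work through `records` addresses."""
    return records / sends_per_hour

hours = hours_to_send(MAX_LIST_SIZE)
days = hours / 24  # assumes the machine sends around the clock
print(f"{MAX_LIST_SIZE} records at {SENDS_PER_HOUR}/hour: "
      f"{hours:.0f} hours, about {days:.1f} days")
```

Plug a multi-million-record list into `hours_to_send` and it becomes obvious why the send rate had to go up.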
They ended up with Sendblaster because there was simply no public-license Windows software at the time that did what Sendblaster does. There were a couple of open-source projects, undoubtedly well-intentioned but decidedly half-baked. And there were a handful of "free" packages that turned out to be nothing more than crippled demos of commercial products. I despise the crippled-demo-ware model, and where objective reviews could be found, Sendblaster's were as good as any. But rather than buy more SB licenses, not knowing whether that 1800/hour rate was a bandwidth limit or a machine limit or a software limit, I swept the interwebs again, and came up with a short list of:
poMMo, a GPL project with an attractive mission statement, but its website hasn't been updated since mid-2008. I wonder if they ever made that leap to PHP5? sadtrombone.wav
NotOneBit Simple Mailing List. v2 is in beta; v1 is pretty sadly lacking in documentation.
OpenEMM, a commercial-grade, public-license project.
OpenEMM differs from all my previous web app experience in that instead of being a PHP app running on an xAMP stack, it's a Java servlet that runs on Tomcat. However, I'm still pretty klutzy in MySQL without a GUI, so it wasn't long before I started itching for the familiarity of phpMyAdmin. And it turned out that trying to get PHP working with Tomcat was about as frustrating a dead end as getting mod_rewrite running on whatever IIS shipped with Windows Server 2003.
So eventually I ended up installing the latest version of XAMPP as well. That project has gotten fancier with security since I last installed it, and rather than puzzle through the conf files to get phpMyAdmin serving to remote browser connections, I checked out SQLyog Community Edition. I also despise the "nag screen" software model, but even without crypto and autocomplete features, SQLyog CE is a great improvement over phpMyAdmin. (Also, I find I totally hate what they've done with the phpMyAdmin that ships with the latest XAMPP.)
Meanwhile, I had also been wrestling with the sheer volume of lead list data. Normally Access 2000 is my Windows database of choice, because unlike MySQL, you can plop an Access MDB file anywhere in your filesystem, burn it to DVD-R for archival, and store queries and bits of VB code in a neat package right there with the data. But the 1.2GB CSV file translated into more data than Access 2000 could import to its native table format. The nearest thing Access could manage was connecting to the CSV file as a linked table.
I never had to deal with this before, but MySQL does have a language facility to import from delimited text formats: LOAD DATA INFILE (with SELECT ... INTO OUTFILE as its export counterpart). phpMyAdmin timed out trying to process that query, because of PHP's configured default script timeout threshold. (In PHP's defense, a short-ish timeout is desirable to prevent runaway scripts, mmmkay.) But SQLyog ran it in under 3 minutes.
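For reference, here's a rough sketch of what such an import statement looks like, wrapped in a little Python helper that just builds the SQL string. The file path, the table name, and the CSV dialect (comma-delimited, double-quoted, CRLF line endings, one header row) are all assumptions for illustration, not what my actual lead file looked like. Note also that plain LOAD DATA INFILE reads the file from the MySQL server's own filesystem; LOAD DATA LOCAL INFILE is the variant that reads from the client side.

```python
def build_load_data_sql(csv_path, table, *, skip_header=True):
    """Return a MySQL LOAD DATA INFILE statement for a comma-delimited file.

    csv_path and table are hypothetical placeholders; adjust the FIELDS and
    LINES clauses to match the real file's delimiters.
    """
    # Forward slashes avoid backslash-escape surprises inside the SQL literal.
    path = csv_path.replace("\\", "/")
    lines = [
        f"LOAD DATA INFILE '{path}'",
        f"INTO TABLE `{table}`",
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'",
        "LINES TERMINATED BY '\\r\\n'",   # emits the SQL escape \r\n (Windows line endings)
    ]
    if skip_header:
        lines.append("IGNORE 1 LINES")    # skip the column-name row
    return "\n".join(lines) + ";"

# Hypothetical path and table name, for illustration only.
print(build_load_data_sql(r"C:\data\leads.csv", "leads"))
```

The resulting statement can be pasted into any SQL client, which sidesteps whatever script timeout a PHP-based front end imposes.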
With data safe and snug in its new MySQL home, I set about extracting a subset for the pilot marketing project. Here I soon discovered that in the emm.properties file, OpenEMM sets arbitrary limits of 200K records in its recipient table, and 60K records in one import operation. So if I had, say, 600K leads, that would require 3 rotations of lists; or perhaps I could jerry-rig some sort of database-rotation system. I'm hoping I can manually bump these limits way up and the Java database connector doesn't plan on caching whole result sets in RAM.
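To get a feel for what living within those default limits would mean, here's a minimal planning sketch. It only does the arithmetic of carving a lead count into rotations and import batches; it doesn't touch OpenEMM or the database, and the function and constant names are my own inventions, not anything from emm.properties.

```python
MAX_RECIPIENTS = 200_000   # recipient-table cap described above
MAX_IMPORT = 60_000        # per-import cap described above

def plan_rotations(total_leads, max_recipients=MAX_RECIPIENTS, max_import=MAX_IMPORT):
    """Split total_leads into rotations, each a list of import batch sizes."""
    rotations = []
    remaining = total_leads
    while remaining > 0:
        rotation_size = min(remaining, max_recipients)
        batches = []
        left = rotation_size
        while left > 0:
            batch = min(left, max_import)
            batches.append(batch)
            left -= batch
        rotations.append(batches)
        remaining -= rotation_size
    return rotations

for i, batches in enumerate(plan_rotations(600_000), start=1):
    print(f"rotation {i}: {len(batches)} imports {batches}")
```

For 600K leads that comes to 3 rotations of 4 imports each (three full 60K batches plus a 20K remainder), which is a lot of manual shuffling and a good argument for raising the limits instead.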