FocusVision Knowledge Base

Bulk Details

A more comprehensive guide to email sends can be found here.

1:  General Operation

Lists should now be put under the survey directory, in a mail subdirectory (e.g. paypal/pap0666/mail/). For a multi-language survey, put each mailing under the corresponding language subdirectory (e.g. paypal/pap0666/fr/mail/).

Lists should generally still be called list.txt or similar starting with list and ending with .txt. bulk will create a list.txt.log and a list.txt.report during sends. bulk will parse the old .log file and extract email addresses rather than just count the number of lines like the old system.

Reminder sends are done using a special switch, not by copying files. A reminder send writes its log file to e.g. list.txt.rem1.log if -R1 is given.

There are no longer any *.lock or *.status files present in the same directory. List sends are still locked, so you cannot send to the same list someone else is sending to.

To send to "My Messages" accounts for eBay surveys, the command bulk [options] ebay can now be used in place of bulk-ebay utilizing the same options as "bulk send".

Survey name, client name

If an email send is in client/survey/mail, it is assumed that the name of the v2 survey is "client/survey" and the name of the client is "client". If the survey has a project= attribute, that overrides the name of the client (e.g. survey gen/pap999 will have paypal/pap999 as project name; this changes the client to "paypal" rather than "gen"). You can also use the --survey switch to override the v2 survey name, and --client to override the client.

The v2 survey name is used to check live status of the survey if doing real sends, and to validate any invited.txt files and other survey settings.

The client name is used for conditional remove lists. E.g. to get the intuit remove list you must be either running in an intuit/ directory or have used the --client=intuit command line argument or have certain keywords in your email.
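The naming rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not bulk's actual code: the function name derive_names and its parameters are hypothetical, and language subdirectories are not handled.

```python
def derive_names(send_dir, project=None, survey=None, client=None):
    """Guess (survey_name, client_name) for a send directory.

    Hypothetical helper: mirrors the rules described in the text only.
    """
    parts = send_dir.strip("/").split("/")
    # Drop the trailing "mail" directory; what remains is the survey path.
    if parts and parts[-1] == "mail":
        parts = parts[:-1]
    survey_name = "/".join(parts)
    client_name = parts[0]
    # A project= attribute on the survey overrides the client name.
    if project:
        client_name = project.split("/")[0]
    # Explicit --survey / --client switches win over everything else.
    if survey:
        survey_name = survey
    if client:
        client_name = client
    return survey_name, client_name


print(derive_names("paypal/pap0666/mail"))
print(derive_names("gen/pap999/mail", project="paypal/pap999"))
```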

2:  bulk-email.conf

The bulk-email.conf file now supports conditional remove lists. Add client=X,Y,Z for a list of clients that need to have this list applied, or keywords=X,Y,Z for a list of case-insensitive keywords that should appear in the email. If either of those is true, the remove list is applied to the send. For example: /somedir/intuit-remove.txt client=intuit keywords=intuit,quickbooks,simplestart specifies that this remove list will work only for intuit/* surveys (or those where you specified --client=intuit) or sends where the email contains any of "intuit", "quickbooks" or "simplestart" keywords.
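As a rough sketch of the "either condition" rule, the following Python parses one configuration line shaped like the example above and decides whether the list applies. The line format is inferred from that single example, and the names parse_conf_line and remove_list_applies are hypothetical; bulk's real parser may differ.

```python
def parse_conf_line(line):
    """Split 'path client=a,b keywords=x,y' into (path, clients, keywords)."""
    path, *options = line.split()
    settings = {"client": [], "keywords": []}
    for option in options:
        key, _, value = option.partition("=")
        if key in settings:
            settings[key] = [v for v in value.split(",") if v]
    return path, settings["client"], settings["keywords"]


def remove_list_applies(client, email_text, clients, keywords):
    """True if the client matches OR any keyword appears (case-insensitive)."""
    if client in clients:
        return True
    text = email_text.lower()
    return any(keyword.lower() in text for keyword in keywords)


line = "/somedir/intuit-remove.txt client=intuit keywords=intuit,quickbooks,simplestart"
path, clients, keywords = parse_conf_line(line)
print(remove_list_applies("paypal", "Tell us about QuickBooks", clients, keywords))
```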

When starting up, bulk send will show you the total number of emails in remove lists as well as a count of how many conditional remove lists there are in the configuration and how many were actually loaded. Use the -v switch to show each list as it is loaded.

v2/data/bulk-include.conf is used as before -- these emails are sent to regardless of throttle settings and optouts. bulk-exclude-domains.conf selects domains to entirely exclude, as before. Finally, v2/hermes/bulk/system.conf contains system remove lists. It's not user-serviceable. Currently the only special item is the intuit remove list, which is used only if the client is intuit, or "intuit", "quicken" or "quickbooks" appear as keywords in the email.

3:  Selecting lists - segmentation for reports or balancing

Using the -l option you can split bulk report or bulk send output by some variable, or you can use it to select sending quotas.

If you use -lfield you will create a segment for each possible non-empty value of that field. This is useful purely for information purposes -- e.g. generate a report split on some list variable.

If you use -lfield:value you create a new segment restricted to that value of the field. E.g. -llist:1 would send only to the entries where list=1, or would create a report only for those entries. You can use * as a wildcard there: -llist:* is equivalent just to -llist -- create a segment for each value of list.

If you use -lfield:value:count you can select a segment which is limited in size. E.g. -llist:1:1000 -llist:2:1000 would send out up to 2000 emails: 1000 to list=1, 1000 to list=2. You can use the wildcard here too: -llist:*:1000 would send 1000 emails to list=1, 1000 to list=2 etc.

Instead of a count, you can also use a percentage as the last argument, together with -n. E.g. -n2000 -llist:1:60% -llist:2:40% would select 60% list=1 and 40% list=2. These are hard quotas, so it's the same as typing -llist:1:1200 -llist:2:800 -- if there are only 400 of list=2, the list=1 quota won't go up to 1600.

Finally you can use the * wildcard in the second and third place to evenly split the -n count among the lists. E.g. -n10000 -llist:*:*, if there are 10 different values of list, would assign 1000 to the first one, 1000 to the second one, etc.
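The quota arithmetic described above can be illustrated with a short Python sketch. The helper resolve_quota is hypothetical and only reproduces the worked numbers in the text; how bulk rounds uneven splits is not documented here, so integer division is an assumption.

```python
def resolve_quota(limit, n=None, num_segments=1):
    """Turn the last part of an -l option into an absolute count.

    limit is a string: "1000" (count), "60%" (share of -n) or "*" (even split).
    """
    if limit == "*":
        return n // num_segments          # assumption: round down
    if limit.endswith("%"):
        return n * int(limit[:-1]) // 100  # assumption: round down
    return int(limit)


# -n2000 -llist:1:60% -llist:2:40%
quota_1 = resolve_quota("60%", n=2000)
quota_2 = resolve_quota("40%", n=2000)

# Hard quotas: a shortfall in one segment is not made up by another.
available = {"1": 5000, "2": 400}
sent_1 = min(quota_1, available["1"])
sent_2 = min(quota_2, available["2"])
print(quota_1, quota_2, sent_1, sent_2)
```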

You can split on multiple variables by separating the names and values with a comma, e.g. -la,b:1,2 would send to entries where a=1 and b=2. You can specify a ratio or count there as well, e.g. -la,b:1,2:50 -la,b:1,3:50 would send 50 emails to a=1&b=2, and another 50 to a=1&b=3.

4:  Sending reminders

You no longer have to copy the list file to send reminders. Simply use -R1 in addition to normal options to send the first reminder, -R2 for the following etc. If the list file is list.txt, using -R1 will create list.txt.rem1.log.
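The file naming convention can be expressed as a one-line rule. The sketch below is illustrative only; log_name is a hypothetical helper, not part of bulk.

```python
def log_name(list_file, reminder=0):
    """Log file name for the main send (reminder=0) or for reminder N."""
    if reminder:
        return "%s.rem%d.log" % (list_file, reminder)
    return "%s.log" % list_file


print(log_name("list.txt"))
print(log_name("list.txt", reminder=1))
```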

The -R option applies to send and report both: you can get a report for the main send or some reminder if you specify -R1.

The new bulk command automatically excludes people who have responded to the survey, provided you are in the right directory. Also, source or e-mail address should have been captured in the survey to identify respondents.

Reminder sends are only sent to people who have been successfully sent the previous send, so if you start with a -n100 soft send and then send a reminder, only at most those 100 people will be eligible for a reminder. If you later do a full send you can send more reminders. Using --reminder-for=X,Y,Z you can limit reminder send only to users in send number X, Y or Z in the previous send/reminder (see list.txt.report for the send numbers -- they start at 1). At this time, any scheduled sends using reminder-for must be 10-15 minutes apart or bulk will return an error because the list is already in use.

5:  Email throttling

In addition to the usual remove lists there are now time-limited per-client and per-survey remove lists. Each time you send an email to a user for a client, that same client may not recontact the user for a 24-hour period. The client name is simply the first path component of the survey's name, or of the project name if one is set for the survey (in the default configuration in v2, this is disabled).

There's also throttling for the same survey. Survey throttling considers the full survey path, for example selfserve/123/1234 and selfserve/123/5678 are NOT the same survey (M11). Once you've sent an email to a user, they cannot be sent another email for that same survey for 14 days, unless it has a different reminder level (i.e. you can send the first invitation, then 3 days later a -R1 reminder, but another invitation or -R1 reminder would be blocked).

You can override the defaults by using --throttle-client=XX and --throttle-survey=XX. Specify a number of hours different from the default (24 and 336) to filter out fewer or more users. Or specify "no" to stop throttling of this type altogether (i.e. --throttle-client=no).

bulk --throttle-client=360 send email.txt list.txt

Finally there's a --full-throttle or -F switch that turns off throttling entirely.
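To make the throttling rules concrete, here is a minimal Python sketch of the decision for a single recipient. It is an interpretation of the description above, not bulk's implementation: the function is_throttled, the tuple layout of previous_sends, and the use of None to stand for "no" are all assumptions.

```python
from datetime import datetime, timedelta


def is_throttled(previous_sends, now, client, survey, reminder_level,
                 client_hours=24, survey_hours=336, full_throttle=False):
    """previous_sends: iterable of (when, client, survey, reminder_level)."""
    if full_throttle:  # -F / --full-throttle: skip all throttling
        return False
    for when, sent_client, sent_survey, sent_level in previous_sends:
        age = now - when
        # Same client contacted this user too recently.
        if (client_hours is not None and sent_client == client
                and age < timedelta(hours=client_hours)):
            return True
        # Same full survey path AND same reminder level too recently.
        if (survey_hours is not None and sent_survey == survey
                and sent_level == reminder_level
                and age < timedelta(hours=survey_hours)):
            return True
    return False


now = datetime(2008, 4, 10, 9, 0)
history = [(now - timedelta(days=3), "selfserve", "selfserve/123/1234", 0)]
print(is_throttled(history, now, "selfserve", "selfserve/123/1234", 1))
print(is_throttled(history, now, "selfserve", "selfserve/123/1234", 0))
```

With this history, a -R1 reminder three days after the invitation is allowed, while a second invitation is blocked until the 14 days have passed.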

Queueing sends (scheduling sends)

Do not use the "at" command to schedule a send. Instead, use the --at or -a option and specify the time (you must quote it if it contains a space). For example: bulk --at "tomorrow 2am" send email.txt list.txt will first go through the usual tests to validate the email and list, and then schedule the send by calling at. This also records the future send on the email status page and shows any other jobs that were inserted in that table, so you can reconsider your send if someone else is also sending 100,000 emails at 1 AM.

You can no longer schedule a send less than 15 minutes from the current time, nor can you schedule a send in the past. See below for new format restrictions.

Here are a few more examples:

Regular scheduled send: bulk --at "7am jan 30" send email.txt list.txt

Soft scheduled send: bulk --at "11pm 10/21/2012" -n 500 send email.txt list.txt

Reminder scheduled send: bulk --at "10am jun 15" -R1 send email.txt list.txt

You can remove the job from the queue using atq and atrm as usual. bulk --at will also tell you the job ID it has just created. Please note that if you remove a job from the queue it will only be shown as removed from the queue in the shell and not from the mail queue page we currently have and use to monitor all email sends.

Supported date formats

Deprecation warning: Acceptable formats have been restricted.

Valid Formats:

  • TIME can be in the formats: 1pm or 1:30pm
  • YEAR can be in the formats: 08 or 2008
  • DATE can be specified as an abbreviated month, day and optional year (5pm aug 24) or with slashes (5pm 08/24/2008)
  • 'today' or 'tomorrow' can be used in combination with time (5pm tomorrow)
  • now + # minutes can also be used

Examples:

  • 5pm aug 24 2008
  • 5:30pm 08/24/08
  • 5:30pm tomorrow
  • 5pm today
  • now+20minutes
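For illustration, the Python sketch below parses two of the example layouts plus the now+N form, then applies the "not in the past, at least 15 minutes away" rule. It is a sketch under assumptions: bulk's real parser is not shown in this article, the names parse_when and check_schedule are hypothetical, and only the layouts listed in PATTERNS are handled.

```python
import re
from datetime import datetime, timedelta

# Two of the documented layouts as strptime patterns (illustrative subset).
PATTERNS = ("%I%p %b %d %Y", "%I:%M%p %m/%d/%y")


def parse_when(text, now):
    """Turn a schedule string into a datetime, or raise ValueError."""
    relative = re.fullmatch(r"now\s*\+\s*(\d+)\s*minutes", text)
    if relative:
        return now + timedelta(minutes=int(relative.group(1)))
    for pattern in PATTERNS:
        try:
            return datetime.strptime(text, pattern)
        except ValueError:
            continue
    raise ValueError("unsupported time format: %r" % text)


def check_schedule(when, now, minimum=timedelta(minutes=15)):
    """Reject times in the past or closer than the minimum lead time."""
    if when < now:
        raise ValueError("cannot schedule a send in the past")
    if when - now < minimum:
        raise ValueError("send must be at least 15 minutes away")
    return when


now = datetime(2008, 8, 24, 12, 0)
print(check_schedule(parse_when("5pm aug 24 2008", now), now))
print(check_schedule(parse_when("now+20minutes", now), now))
```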

To view pending sends, run bulk pending or check the mail queue page: https://v2.decipherinc.com/admin/mailqueue

Common options

Several options have two ways they can be written: either the short option (e.g. -S) or long (--shuffle). They are equivalent, but the long option may be easier to recall.

-n XXXX selects a soft send of up to XXXX emails. Unlike previously, this counts sendable emails: with -n 5000 the "Total Sent" will be 5000 while "List Total" will be larger depending on remove lists. E.g. bulk -n 5000 send email.txt list.txt

-v selects verbose operation. All remove lists are displayed as they are loaded. Usually only additional remove lists are displayed.

-S or --shuffle tells bulk to shuffle the list before examining it. The shuffle order stays the same as long as the list file is not changed, so if you do a bulk test one day and bulk send the next day, the emails selected will stay the same.

-R selects reminder level. Use -R1 for the first reminder, -R2 for the second etc. Use --reminder-for=send,send,... to restrict reminders only to recipients of those sends in the previous send (whether the original send or a reminder). Check the numbers against your previous send report to ensure that bulk is sending to the right people. At this time, any scheduled sends using reminder-for must be 10-15 minutes apart or bulk will return an error because the list is already in use.

-a or --at is used to schedule a send for some point in the future.

--throttle-client and --throttle-survey control the resend time (in hours) for the matching client or survey. Set to "no" to ignore. --full-throttle or -F disables any throttling whatsoever.

--rate controls how many emails get released to our mail queue per hour. Your send will complete at high speed as usual but then our mail agent will only send that many emails out per hour (so --rate 3600 will wait 1s before sending each email). This sets the minimum delay before an email can arrive at a respondent's server.
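The relationship between the rate and the per-email delay is simple arithmetic, sketched here in Python. Both helper names are hypothetical, and the drain-time estimate assumes the mail agent releases emails at exactly the requested rate.

```python
def seconds_between_emails(rate_per_hour):
    """Delay between consecutive emails for a given --rate value."""
    return 3600 / rate_per_hour


def hours_to_drain(total_emails, rate_per_hour):
    """Rough time for the whole send to leave the mail queue."""
    return total_emails / rate_per_hour


print(seconds_between_emails(3600))  # one email per second
print(hours_to_drain(10000, 2000))
```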

-I or --invited lets you control which invited.txt files bulk scans for source IDs; see below.

Options that don't take arguments can be combined. E.g. -Svn10000 is the same as -S -v -n 10000
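This combining behaviour follows the usual getopt convention. As an illustration only (this article does not say which parser bulk uses), Python's standard getopt module splits the combined form the same way and also stops at the first non-option argument, which matches the rule that options come before the command:

```python
import getopt

# "Svn:" declares -S and -v as flags and -n as taking an argument.
combined, rest = getopt.getopt(
    ["-Svn10000", "send", "email.txt", "list.txt"], "Svn:")
separate, _ = getopt.getopt(
    ["-S", "-v", "-n", "10000", "send", "email.txt", "list.txt"], "Svn:")

print(combined)
print(rest)
print(combined == separate)
```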

Remember that options always go directly after bulk but before the command (e.g. send, filter, freq).

6:  Remove lists

The bulk filter, send and test commands take any number of additional remove lists as parameters. Such a remove list can be:

  • a v1 or v2 survey name (every stand-alone entry that looks like an email address and all extra variables are used for removal). Specify the survey name (v2 surveys have preference)
  • a flat text file: each line will be removed. Specify full or relative path to file.
  • a flat text file with explicit column selection. The syntax for that is flat:filename:fieldname. E.g. if you have a file with fields source, email and age, using flat:file.txt:email will use that specified field.

  • A Panel Manager database. Specify panel:mdp to get all the email addresses for MDP panel members (active or inactive)

  bulk -R1 -llist:1:2000 -llist:2:2000 send email.UK.txt list.UK.txt flat:uk-remove.txt:email
  or
  bulk send email.txt list.txt +uk-remove.txt

Only the source and email columns from your list are searched for in the remove lists, unless you specify --check=column, in which case that additional column is also checked against the remove list.

Note: If you have a large flat text file, you may get a memory error when you try to load it. To solve this, you can run the following to create an md5 file.

make-hash-database file.txt

This will create a file.txt.md5 that bulk will use to index.  Note the send command is still the same.  Rerun this if your file changes.

Defaults

bulk send, test and filter will scan for a file named bulk-options in the current directory and all directories above it. In that file you can put any default options, one per line. Typically that would be something like --throttle-client=48 or --throttle-client=no or /home/jaminb/extra/remove/list.txt. Lines that start with - are put at the beginning of the command line as if you had typed them. Lines that begin with -- followed by a space, and any other lines, are put at the end.

For example, a bulk-options file containing:

-- -the/survey/path

adds -the/survey/path at the end of the options. Emails are not suppressed, as the system reports removal of the list file.

7:  Suppression

Suppression normally consists of two categories:

  • Those who opt out (unsubscribes) from receiving future email sends
  • “Extra source” variables associated with respondents who completed the survey (qualified, terminated or over quota) for additional suppression

Suppression information is stored in a company's remove directory (e.g. selfserve/9d3/remove) and prevents a person from receiving future survey invitations or prevents respondents from re-entering the survey.

7.1: How Respondents are Categorized for Suppression

The below explains how a respondent is suppressed into a category by default.

7.1.1:  Opt-Outs

Respondents are categorized under 'opt-outs' by default if:

  • They request to be opted-out from any future mailings from us on the email invite. Respondents get removed by checking the multiple remove databases that we have.

7.1.2:  Additional Suppression

Respondents are categorized under 'additional suppression' by default if:

  • They are considered survey completes (qualifieds, terminates, over quotas) and checked against extra variables passed within the invite and declared in the survey (e.g. source)
  • They are considered survey completes (qualifieds, terminates, over quotas) and checked against open-ended questions within the survey that collect email information (e.g. if an open end contains respondent@domain.com)
  • A suppression list is used within the send, e.g. bulk send email.txt list.txt flat:dont-send.txt:removals
  • Additional suppression also includes any AOL addresses that we have blocked because the user has complained

7.2:  Modifying Suppression

When running bulk, you can modify exclusion so that only "extra variables" (i.e. source) are suppressed, using the following syntax. Here, any open-ended questions that might contain email information will not be suppressed.

bulk send email.txt list.txt -abc/abc1234 v2:abc/abc1234:extra

where:

-abc/abc1234 removes all suppression (opt-outs and "extra source" variables)

v2:abc/abc1234:extra reads in a v2 survey with this path and then applies additional suppression ("extra source" variables) to the survey.

8:  Web pages, Attachments & Encoding

bulk build -- build an email from parts

bulk [-e encoding] build output-email.txt directory

This builds a MIME-compliant email, output-email.txt, from the contents of the directory. The directory should contain:

  • mail.txt -- the text version of the email
  • mail.html -- the HTML version of the email
  • headers.txt -- From, Subject, To, maybe Reply-To
  • + any inline attachments (e.g. GIF files you want to refer to in the email's HTML)

If your mail.html references any images (by using <img src="...">) they are included as "inline attachments", i.e. they won't have to be downloaded from some site. Of course, that makes the email bigger. Those files should be in the email directory.

The science behind this is that the src="file.gif" gets turned into src="cid:2342342@decipherinc.com" and the file file.gif gets attached to the email and given the id 2342342@decipherinc.com. That way the email client can discover that the file is an inline attachment it should find among the MIME parts of the email.
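The same cid mechanism can be demonstrated with Python's standard email package. This is a generic illustration of inline attachments, not how bulk build is implemented; the addresses, the example.com domain, the placeholder image bytes and the helper name build_inline_email are all made up for the example.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_inline_email(text, html_template, image_bytes, domain="example.com"):
    """Build a text+HTML email whose HTML refers to an inline image by cid."""
    msg = EmailMessage()
    msg["Subject"] = "Survey invitation"
    msg["From"] = "sender@example.com"
    msg["To"] = "respondent@example.com"
    msg.set_content(text)                      # text/plain part
    cid = make_msgid(domain=domain)            # looks like "<...@example.com>"
    # The HTML uses the id without the surrounding angle brackets.
    msg.add_alternative(html_template.format(cid=cid[1:-1]), subtype="html")
    # Attach the image to the HTML part and give it the matching Content-ID.
    msg.get_payload()[1].add_related(image_bytes, "image", "gif", cid=cid)
    return msg, cid


msg, cid = build_inline_email(
    "Please take our survey.",
    '<p>Hello <img src="cid:{cid}"></p>',
    b"GIF89a",  # placeholder bytes standing in for a real GIF file
)
print(msg.get_content_type())
```

The mail client matches the cid: reference in the HTML against the Content-ID header of the image part, so nothing has to be downloaded from a web server.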

You can also attach any other files you want to the email by specifying their path names, e.g.:

  • bulk build email.txt email email/somefile.pdf email/somefile.zip

That will add the files to the email.txt and mark them as "attachment".

The default assumption is that the emails you've written are in utf-8 or ascii (ASCII is a subset of utf-8). When they are encoded, they are encoded to ascii or utf-8 depending on whether they contain any special characters.

If you want to encode to something else, you use the -e switch:

bulk -e japan build ...
bulk -e china build ..
bulk -e english build ..

Using "japan" selects iso-2022-jp, using "china" selects "gb18030" and using "english" uses iso-8859-1, although that probably shouldn't be necessary -- most email clients understand utf-8 which would be selected as default.

9:  bulk freq -- check field frequencies

bulk [-n top] freq listfile.txt

For each field in listfile.txt, freq will show you the 10 most frequently appearing values (use -n to select more values to be shown).

10:  bulk report -- generate a report for sends

bulk [-Rx] [-l...] report listfile.txt

This command parses the log file for this list (you can specify -R to parse the reminder log file) and displays the information for each send performed as well as a total for all sends (if multiple sends have been made). The default is not to split into any segments regardless of how things were done when bulk send was used. You can use the -l switch to select any segments you want. Running this command will not affect the existing reports on disk.

11:  bulk filter -- split email list in bad/good parts

bulk [--dupes] filter list.txt good.txt bad.txt [additional remove lists]

This filters the file list.txt into good.txt (records that would normally be sent mail) and bad.txt (records that won't be sent mail because they've opted out or similar). The --dupes option described below can be used with bulk filter. By default, bad.txt will be further categorized into the additional files bad.txt.bounce, bad.txt.dupe, bad.txt.malformed and bad.txt.optout.

WARNING: filter ignores throttling.

12:  bulk search -- search sends

bulk search user@domain.com will display all the sends for that email address; bulk search *@domain.com will display all sends for that domain -- the * can be put anywhere if desired. Roughly the last 3 months' sends are displayed.

bulk search @file.txt will search a file of emails if you need to search for a lot of emails.

13:  bulk send, bulk test -- send for real/test

bulk [options] send email-file list-file [additional remove lists]

Sends the given email to some or all people in the list-file. All the emails are read in first and you are presented with a report (split, if you used -l) about how many emails will actually go out. You then have the option to press Y to continue, V to view the emails that were selected (this is done using the less viewer; press spacebar/arrow keys to scroll or q to exit it). Any other key aborts the bulk mail sending.

If you specify the --at option, pressing Y does not send but simply schedules the send. You will be told the actual time the send will go out.

The --via=charon is mostly obsolete: to send through an unbranded IP use --type=unbranded. To send through a priority queue (only for small sends which should go out fast), use --type=priority. For developers only, --via=test will write the emails to a sendmail.out file rather than send them.

--dupes will allow dupes to get through rather than suppress them. Don't use this option with soft sends: a dupe that appears in the first send will not be sent in the second send. In such a case, split the list manually.

--ignore-blank will let you send to a list which has values missing, but will skip those respondents.

After the send has finished, you are presented with the report for that particular send. The corresponding *.report file for the list is updated with it as well, and a total report if there are multiple soft sends. The *.report file will have varying splits: the total is always unsplit and the specific soft sends are split by whatever options were originally requested.

The bulk test command takes exactly the same options as bulk send but stops short of sending emails. It does not create any additional *.log/*.report file so it can be run without cleanup afterwards.

14:  bulk aol -- check AOL block status

Running bulk aol lets you see how many complaints each send in this directory or any below it has received (so you can run it in e.g. v2/ebay to see all ebay complaints). Once a send receives more than 2% complaints (2% of actual sent AOL addresses), or 3% if there are fewer than 15 complaints, the directory becomes blocked. Any further sends to AOL addresses are skipped (without writing them to the log file, so if the block is later lifted they can be sent again).
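One reading of the blocking rule is sketched below in Python: the threshold is 3% while a send has fewer than 15 complaints and 2% otherwise. The function aol_blocked is hypothetical and the exact boundary handling in bulk is not documented here.

```python
def aol_blocked(complaints, aol_sent):
    """True if the complaint rate for a send exceeds the blocking threshold."""
    if aol_sent == 0:
        return False
    # Small absolute numbers get a more lenient threshold.
    threshold = 0.03 if complaints < 15 else 0.02
    return complaints / aol_sent > threshold


print(aol_blocked(30, 1000))  # 3.0% with 30 complaints
print(aol_blocked(10, 400))   # 2.5% with only 10 complaints
```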

Generally there needs to be an exceedingly good reason for lifting an AOL block: our goal is less than 1% complaints from AOL users, so if 2% complain on a send that's a good sign there's something wrong: the mail is unwanted or bad. Ryan or Cory can lift an AOL block after a good reason is given, by running bulk aol allow in the correct directory. A directory can also be pre-emptively blocked by running bulk aol deny.

You can see overall day-by-day AOL complaint statistics on http://v2.decipherinc.com/admin/aol -- the performance of that page has recently improved and takes about 30 seconds.

15:  Variables in bulk emails

[var] references a variable from your list file. ${...} lets you run any Python code within the survey.

Typically you'll run ${date()} to insert today's date (e.g. April 07, 2008) but you can insert any Python expression. There are some predefined ones in the file v2/data/bulk-code.py: currently the only one is foreignDate("spanish") which replaces the English month name with a corresponding Spanish one.

Unsubscribe link

Emails that contain what looks like an unsubscribe link (something that has /remove in it) have that link put into a special List-Unsubscribe header. That URL is then shown to Hotmail and other users instead of the "Report Spam" link, provided our email address is on a safe list -- forcing them to unsubscribe rather than report as spam. If you want a different link to be used, add a List-Unsubscribe: <http://....> header yourself. If you want no link to be used, add an empty List-Unsubscribe: header. In both cases the header goes in the first part of the email, the one that includes From etc.

16:  Blacklisting domains

Create a blacklist.txt file with the domain names you want to suppress from sends (e.g. rr.com, aol.com). This is typically for unbranded sends that cause a high number of complaints and endanger our sends to a domain.

17:  Compatibility levels

Like the survey, bulk has compatibility levels. Any email modified after a certain date must meet certain checks:

  • Mar 30, 2009: escape &lang as &amp;lang in the HTML email; must not use entities in the text/plain part (e.g. &gt;); data must match character encoding (e.g. if you use us-ascii, you cannot use ???); invalid character sets not allowed; unknown attachment types not allowed; character sets must be consistent (no us-ascii in one part and latin-1 in another).

  • May 18th, 2009: If the survey (autodetected or specified with --survey) uses unique=source or loads up any DatabaseSystem files, then all the sources in your list files must appear in there. Use --invited (or -I) to fine-tune this, e.g. -I somefile.txt tells bulk to verify the list sources against the contents of that file (to be found in the survey directory); -I fr tells bulk that you are really sending this for the fr subsurvey and that it should verify sources from there. Use -Inone to turn off this checking.

18:  Verifying lists

By putting a verify.py file in any directory, you will force bulk to verify any lists before sends. This can be used to verify some fields are unique, or exactly so and so many characters long. Put a single verify function inside that file; you will have access to all list fields as global variables and the function will get called once for each list record. You can use assert or some builtin check_xxx functions to verify list validity. For example:

def verify():
  check_len(source, 3) # source must be exactly 3 characters long
  check_digit(list) # list must consist of digits only
  check_len(gid, 5, 7) # gid must be between 5 and 7 characters long
  check_alnum(pid)     # pid must be alphanumeric (i.e. a-z 0-9 A-Z)
  check_unique(src)    # src must not be used twice (you can only use this function once)

  # You can also just call assert with a boolean value that must be true
  assert src.startswith("xx")

The verification happens every time you run bulk test or bulk send.

19:  bulk clean -- validate and clean email lists

bulk [-v] [--threshold=xx] clean list.txt [unique_list_id]

This command uses the DataValidation API to validate and clean email lists. The following options are available:

-v (optional): Outputs detailed stats as listed here: https://datavalidation.zendesk.com/hc/en-us/articles/201572353-What-do-the-codes-in-my-Email-Assurance-Report-mean-

--threshold (optional): The threshold at which a file will be cleaned. Default is 5. If more than 5% of the file includes F grades, then the file will be cleaned. Use > 100 to never clean, < 0 to always clean.

list.txt: List of email addresses to be cleaned. Must include at least an "email" column.

unique_list_id: The list ID to assign to this run of the script. If the same list ID is used with a file that was previously cleaned, it will not be cleaned again. Any time a new unique list ID is used the command is logged at [server]/admin/sqlreport/bulk_clean .
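The --threshold rule can be read as a simple percentage comparison, sketched here in Python. The helper should_clean and the representation of grades as a list of letters are assumptions made for illustration; they are not part of the DataValidation API or of bulk.

```python
def should_clean(grades, threshold=5):
    """True if the share of "F" grades (in percent) exceeds the threshold."""
    if not grades:
        return False
    percent_f = 100 * sum(1 for grade in grades if grade == "F") / len(grades)
    return percent_f > threshold


grades = ["F"] * 10 + ["A"] * 90             # 10% F grades
print(should_clean(grades))                  # default threshold of 5
print(should_clean(grades, threshold=101))   # > 100: never clean
print(should_clean(["A"] * 100, threshold=-1))  # < 0: always clean
```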