Monitoring

Email Service Monitoring

Customers frequently want to monitor the health of their email service, which helps when answering support calls along the lines of "my email doesn't work". This is particularly valuable if you're using a hosted email service such as Google Apps or Office 365, but it is also helpful for monitoring on-premises messaging services such as Exchange.

We provide two distinct options for measuring email service health: Email Round-Trip Time and Email Transit Time.  Note that the following instructions assume that you already have the Email_RoundTrip and Email_TransitTime datasources in your LogicMonitor account.  If you don't have these datasources in your account, make sure that you import them from our datasource repository.

Email Round-Trip Time

With Email Round-Trip Time, our collector acts just like a desktop email client (e.g. Outlook, Thunderbird, Mac Mail, etc.) in that it sends an outgoing message via your SMTP service and retrieves that same message from your IMAP service (see Note 1). The time to complete each step in the round-trip transaction is measured and reported, encompassing any latency within your network or in the message handoff between your SMTP service, message store, and IMAP service.

The first step in adding Email Round-Trip Time to your account is to add email_rtt to the system.categories property on a device in your account (see Note 2). Next you'll need to add email service connectivity information as the following device properties:

| device property | description | example |
| --- | --- | --- |
| smtp.host | hostname of the smtp service | smtp.gmail.com |
| smtp.type | smtp security type: blank/SSL/TLS | SSL |
| imap.host | hostname of the imap service | imap.gmail.com |
| imap.type | imap security type: blank/SSL/TLS | TLS |
| email.user | userid required for account authentication | testuser |
| email.pass | password required for account authentication | hello123 |
| email.addr | the actual email address for this account | testuser@gmail.com |
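To make the round-trip idea concrete, here is a minimal Python sketch that sends a uniquely tagged message over SMTP and then polls IMAP until it appears, timing each leg. It is an illustration only, not the DataSource's actual implementation. The function names are hypothetical, and several details are assumptions: that "SSL" means implicit TLS (ports 465/993), that "TLS" means STARTTLS (ports 587/143), that a blank type means an unencrypted connection, and that the message lands in INBOX.

```python
import imaplib
import smtplib
import time
import uuid
from email.message import EmailMessage


def build_probe(addr: str) -> EmailMessage:
    """Build a test message whose unique subject lets us find it again."""
    msg = EmailMessage()
    msg["From"] = addr
    msg["To"] = addr
    msg["Subject"] = f"rtt-probe-{uuid.uuid4().hex}"
    msg.set_content("Email round-trip probe.")
    return msg


def open_smtp(host: str, security: str) -> smtplib.SMTP:
    """Open an SMTP connection; port choices are assumed defaults."""
    if security == "SSL":
        return smtplib.SMTP_SSL(host, 465, timeout=30)
    conn = smtplib.SMTP(host, 587 if security == "TLS" else 25, timeout=30)
    if security == "TLS":
        conn.starttls()
    return conn


def open_imap(host: str, security: str) -> imaplib.IMAP4:
    """Open an IMAP connection; port choices are assumed defaults."""
    if security == "SSL":
        return imaplib.IMAP4_SSL(host, 993)
    conn = imaplib.IMAP4(host, 143)
    if security == "TLS":
        conn.starttls()
    return conn


def round_trip(props: dict, poll_seconds: float = 2.0, max_polls: int = 30) -> dict:
    """Send a probe and wait for it to appear; return timings in seconds."""
    msg = build_probe(props["email.addr"])
    start = time.monotonic()
    with open_smtp(props["smtp.host"], props.get("smtp.type", "")) as smtp:
        smtp.login(props["email.user"], props["email.pass"])
        smtp.send_message(msg)
    sent = time.monotonic()

    imap = open_imap(props["imap.host"], props.get("imap.type", ""))
    try:
        imap.login(props["email.user"], props["email.pass"])
        for _ in range(max_polls):
            imap.select("INBOX")
            status, data = imap.search(None, "SUBJECT", f'"{msg["Subject"]}"')
            if status == "OK" and data[0]:
                break
            time.sleep(poll_seconds)
        else:
            raise TimeoutError("probe message was not found via IMAP")
    finally:
        imap.logout()
    done = time.monotonic()
    return {"send_s": sent - start, "retrieve_s": done - sent, "total_s": done - start}
```

Calling `round_trip` with a dictionary keyed by the device property names in the table above (smtp.host, imap.host, email.user, and so on) would return the send, retrieve, and total times for one probe. Running something like this by hand can help confirm that credentials and security types are correct before applying the DataSource.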


Email Transit Time: How it Works

Our Email Transit Time DataSource measures the time it takes to deliver a message between two different email services, for example between Gmail and Office 365 (see Note 3). This is particularly useful for ensuring that your email service is properly delivering messages to, and accepting messages from, other service providers. Just like Email Round-Trip Time, Email Transit Time involves two protocols:

SMTP: sends messages from a client to a server.
IMAP: retrieves messages from the server to a client. 

Let's take a look at the full process for evaluating Email Transit Time using Gmail and Office365 as an example.

Step 1: Using SMTP, the Collector submits a message to Provider A (e.g. Gmail).
Step 2: Still using SMTP, the message is addressed to an account at a second provider (e.g. Office 365), so Provider A hands it off to that provider for delivery.
Step 3: Finally, using IMAP, the Collector retrieves the message from Office 365.

This process highlights the two critical components of email service monitoring: the time it takes messages sent from your business's email server to be delivered externally, and the time it takes messages sent externally to reach your inbox.

Keeping this in mind, the Email Transit Time DataSource contains an ErrorCode datapoint that will denote at which stage of the above process an error occurred (if any). The following are applicable error codes: 

| Error Code | Meaning | Description |
| --- | --- | --- |
| -2 | Invalid instance | Unable to locate a valid service. |
| -1 | Missing Params | Basic parameters (i.e. username/pass) were not available to properly validate transit time. |
| 0 | OK | The roundtrip validation was successful. |
| 1 | SMTP connection failure | The Collector cannot connect to the SMTP server. |
| 2 | SMTP send failure | We connected to the SMTP server and attempted to send a message, but the server did not accept the message. |
| 3 | IMAP connection failure | We are unable to connect to the destination server hosting the inbox that contains our message. This is typically due to an incorrect hostname, bad userid/password, or the specified security type is incorrect. |
| 4 | IMAP search failure | Each time the Email_TransitTime DataSource runs, it generates a message with a unique subject line. In order to validate the round-trip/transit time, we have to retrieve this unique message from the IMAP server. We are unable to retrieve the unique message at the destination server. |
| 5 | IMAP close failure | Once we have validated the roundtrip and no longer need an IMAP connection, we send a "close" command to terminate it. This error indicates that we were unable to properly terminate the IMAP connection. |
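One way to read the error codes above is as "the first stage that failed". The following Python sketch models that idea: each stage is paired with a code, stages run in order, and the code of the first failing stage is reported. The enum member names and the `run_stages` helper are hypothetical and exist only for illustration; the numeric values come from the table, but how the DataSource assigns them internally is not documented here.

```python
from enum import IntEnum
from typing import Callable


class TransitError(IntEnum):
    """Numeric values mirror the ErrorCode table; the names are illustrative."""
    INVALID_INSTANCE = -2
    MISSING_PARAMS = -1
    OK = 0
    SMTP_CONNECT = 1
    SMTP_SEND = 2
    IMAP_CONNECT = 3
    IMAP_SEARCH = 4
    IMAP_CLOSE = 5


def run_stages(stages: list[tuple[TransitError, Callable[[], None]]]) -> TransitError:
    """Run each stage in order and return the code of the first one that raises."""
    for code, action in stages:
        try:
            action()
        except Exception:
            return code
    return TransitError.OK


def simulated_failure() -> None:
    """Stand-in for a stage that cannot complete (e.g. a refused connection)."""
    raise ConnectionError("simulated failure")


if __name__ == "__main__":
    result = run_stages([
        (TransitError.SMTP_CONNECT, lambda: None),
        (TransitError.SMTP_SEND, lambda: None),
        (TransitError.IMAP_CONNECT, simulated_failure),
        (TransitError.IMAP_SEARCH, lambda: None),
        (TransitError.IMAP_CLOSE, lambda: None),
    ])
    print(int(result), result.name)
```

Because later stages never run once an earlier one fails, a code of 3 implies that the SMTP connection and send both succeeded, which narrows troubleshooting to the destination IMAP settings.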

Setting up Email Transit Time monitoring

To apply Email Transit Time, first you'll need to add email_transit to the system.categories property. Next you'll need to populate connection properties for both email services. Assuming you have services "a" and "b" — say, "Gmail" and "Office365" — you'll need to add properties for each of these as follows:

| device property | description | example |
| --- | --- | --- |
| transit.gmail.smtpHost | hostname for smtp service "a" | smtp.gmail.com |
| transit.gmail.smtpType | encryption type for smtp service "a": blank/SSL/TLS | TLS |
| transit.gmail.imapHost | hostname for imap service "a" | imap.gmail.com |
| transit.gmail.imapType | encryption type for imap service "a": blank/SSL/TLS | SSL |
| transit.gmail.addr | email address for account on smtp service "a" | test.acct@gmail.com |
| transit.gmail.user | userid for account on smtp service "a" | test.acct@gmail.com |
| transit.gmail.pass | password for account on smtp service "a" | Gp@ssw0rd |
| transit.o365.smtpHost | hostname for smtp service "b" | smtp.office365.com |
| transit.o365.smtpType | encryption type for smtp service "b": blank/SSL/TLS | TLS |
| transit.o365.imapHost | hostname for imap service "b" | outlook.office365.com |
| transit.o365.imapType | encryption type for imap service "b": blank/SSL/TLS | SSL |
| transit.o365.addr | email address for account on smtp service "b" | other.acct@o365.com |
| transit.o365.user | userid for account on smtp service "b" | other.acct@o365.com |
| transit.o365.pass | password for account on smtp service "b" | Op@ssw0rd |

It is essential that the "keys" used for service "a" and service "b" in the device properties above (e.g. transit.gmail.smtpHost) are consistent across each service.

Finally, once you've added the necessary device properties for each service, you'll need to create monitoring instances for each transit direction. To do so, select Add Monitored Instance from the Manage Device drop-down menu. In the resulting dialog box fill out the fields as follows:

| field name | contents |
| --- | --- |
| Datasource | Email Transit Time |
| Name | Gmail -> Office365 |
| Wildcard Value | gmail:o365 |

In the Wildcard Value field, the colon-separated keys must match the keys used to represent services "a" and "b" in the transit.* device properties you added above.
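Since a mismatch between the Wildcard Value keys and the transit.* property names is an easy mistake to make, a small pre-flight check can help. The Python sketch below splits a Wildcard Value into its two keys and lists any expected transit.* properties that are absent or empty. The helper names are hypothetical, and the choice of which fields to treat as required (everything except the Type fields, which may legitimately be blank) is an assumption of this sketch rather than documented DataSource behavior.

```python
# Fields checked per service key. Treating the *Type fields as optional
# (they may be blank) is an assumption of this sketch.
REQUIRED_FIELDS = ("smtpHost", "imapHost", "addr", "user", "pass")


def split_wildcard(wildcard: str) -> tuple[str, str]:
    """Split a Wildcard Value such as 'gmail:o365' into its two service keys."""
    parts = wildcard.split(":")
    if len(parts) != 2 or not all(part.strip() for part in parts):
        raise ValueError(f"expected 'keyA:keyB', got {wildcard!r}")
    return parts[0].strip(), parts[1].strip()


def missing_properties(wildcard: str, props: dict[str, str]) -> list[str]:
    """List transit.* properties that are absent or empty for both keys."""
    missing = []
    for key in split_wildcard(wildcard):
        for field in REQUIRED_FIELDS:
            name = f"transit.{key}.{field}"
            if not props.get(name):
                missing.append(name)
    return missing
```

For example, calling `missing_properties("gmail:o365", device_props)` on a dictionary of the device's properties returns an empty list when every expected property is present, and otherwise names the ones to add, such as transit.o365.imapHost.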

After adding an instance for one transit direction, repeat the process if you'd like to create an instance to measure the reverse direction.

Notes

1. Because this datasource uses IMAP for message retrieval, in a Microsoft Exchange or Office 365 environment you'll need to specifically enable access via IMAP to use this datasource.

2. These measurements will take place from the perspective of the collector assigned to monitor the device on which these datasources are applied. Meaning: the messages are sent and retrieved from the device's collector rather than the device itself.  You can apply these datasources to the server your collector is installed on (if it has been added into monitoring), or you can apply the datasources to any other monitored device you'd like.

3. When using public email services (such as Yahoo Mail, Gmail, Outlook.com, etc.) to help monitor email transit time, be aware that the anti-spam measures employed by these services may interfere with our monitoring instrumentation. Although we've had good luck using the Gmail service for this purpose, message hygiene services evolve over time. As such, we can't warrant the use of these datasources against any public service. For best results use the email service of a business partner as a third party to validate transit times.

Troubleshooting

Diagnostic logs for these DataSources are enabled by default, and can be found in the Collector's "logs" directory -- typically either

/usr/local/logicmonitor/agent/logs or C:\Program Files (x86)\LogicMonitor\Agent\logs.

Logs for Email Round-Trip Time are stored as:

  • [hostname]-emailRTT-protocol.log
  • [hostname]-emailRTT-debug.log

while Email Transit Time logs are:

  • [hostname]-[instance]-emailTransit-protocol.log
  • [hostname]-[instance]-emailTransit-debug.log

The Protocol logs show the underlying SMTP and IMAP commands issued by the Collector and the corresponding responses by the target server(s). You may need some expertise in the SMTP & IMAP protocols to understand what's going on here, but frequently there's lots of helpful information contained in these. Debug logs show telemetry of the DataSource itself -- indicating what steps have been attempted and where it's succeeded or failed.
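When a Collector monitors many devices, the logs directory can be crowded. As a convenience, the Python sketch below lists the email-related log files in a given directory, newest first, by matching the file name patterns described above. The function name is hypothetical, and the glob patterns are simply derived from the documented naming scheme.

```python
from pathlib import Path


def find_email_logs(log_dir: str) -> list[Path]:
    """Return Email Round-Trip and Email Transit logs, most recently modified first."""
    directory = Path(log_dir)
    patterns = ("*-emailRTT-*.log", "*-emailTransit-*.log")
    found = [path for pattern in patterns for path in directory.glob(pattern)]
    return sorted(found, key=lambda path: path.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    # Typical Linux location mentioned above; adjust for your Collector.
    for log_file in find_email_logs("/usr/local/logicmonitor/agent/logs"):
        print(log_file.name)
```

Starting with the most recently modified debug log is usually the quickest way to see which step failed, before moving on to the matching protocol log for the raw SMTP or IMAP exchange.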