Email Service Monitoring
Overview
Frequently, our customers want to monitor the health of their email service, which is helpful for answering support calls along the lines of “my email isn’t working”. This is particularly valuable if you’re using a hosted email service such as Google Apps or Microsoft Office 365, but is also helpful for monitoring your on-premises messaging services such as Exchange.
We provide monitoring for two email service health metrics:
- Email round-trip time
- Email transit time
These metrics are monitored by the Email_RoundTrip and Email_TransitTime DataSources respectively. The instructions in this article assume that you have already imported these DataSources, which are available from the LogicMonitor public repository.
Monitoring Email Round-Trip Time
When monitoring email round-trip time, our Collector acts as a desktop email client (e.g. Outlook, Thunderbird, Mac Mail, etc.) in that it sends an outgoing message via your SMTP service and retrieves that same message from your IMAP service (see footnote 1). The time to complete each step in the round-trip transaction is measured and reported, encompassing any latency within your network or in the message handoff between your SMTP service, message store, and IMAP service.
The first step in configuring email round-trip monitoring is to add “email_rtt” as a value to the system.categories property on a device in your account (see footnote 2). Next, add the email service connectivity information as custom properties on that device, as outlined in the following table.
Property | Value | Example |
smtp.host | Hostname of the SMTP service | smtp.gmail.com |
smtp.type | SMTP security type: blank/SSL/TLS (see footnote 3) | SSL |
imap.host | Hostname of the IMAP service | imap.gmail.com |
imap.type | IMAP security type: blank/SSL/TLS | TLS |
email.user | User ID required for account authentication | testuser |
email.pass | Password required for account authentication | hello123 |
email.addr | The actual email address for this account | [email protected] |
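The round-trip measurement described above can be sketched in Python. This is an illustrative sketch only, not the DataSource's actual implementation: the function names, the unique subject token used to recognize the probe message, and the choice of SSL on ports 465 and 993 are all assumptions made for the example.

```python
import imaplib
import smtplib
import time
import uuid
from email.message import EmailMessage


def build_probe(addr):
    """Create a self-addressed message with a unique subject token (hypothetical scheme)."""
    token = f"rtt-probe-{uuid.uuid4().hex}"
    msg = EmailMessage()
    msg["From"] = addr
    msg["To"] = addr
    msg["Subject"] = token
    msg.set_content("Email round-trip probe.")
    return token, msg


def measure_round_trip(send, find, addr, timeout=60.0, poll=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Time the send step and the retrieval step for one probe message.

    send(msg) delivers the message; find(token) returns True once a message
    whose subject contains the token is visible in the mailbox.
    """
    token, msg = build_probe(addr)
    start = clock()
    send(msg)
    sent = clock()
    while not find(token):
        if clock() - sent > timeout:
            raise TimeoutError(f"probe {token} not retrieved within {timeout}s")
        sleep(poll)
    done = clock()
    return {
        "smtp_ms": (sent - start) * 1000,
        "imap_ms": (done - sent) * 1000,
        "total_ms": (done - start) * 1000,
    }


def smtp_sender(host, user, password):
    """Return a send(msg) callable; assumes smtp.type is SSL (port 465)."""
    def send(msg):
        with smtplib.SMTP_SSL(host, 465) as client:
            client.login(user, password)
            client.send_message(msg)
    return send


def imap_finder(host, user, password):
    """Return a find(token) callable; assumes imap.type is SSL (port 993)."""
    def find(token):
        with imaplib.IMAP4_SSL(host, 993) as client:
            client.login(user, password)
            client.select("INBOX")
            status, data = client.search(None, "SUBJECT", f'"{token}"')
            return status == "OK" and bool(data[0].split())
    return find
```

With the example properties from the table, a call would look like `measure_round_trip(smtp_sender("smtp.gmail.com", user, password), imap_finder("imap.gmail.com", user, password), addr)`, where `user`, `password`, and `addr` hold the email.user, email.pass, and email.addr values. Passing the send and find steps in as callables keeps the timing logic separate from the network code.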
Monitoring Email Transit Time
Our Email Transit Time DataSource measures the time it takes to deliver a message between two different email services, for example between Gmail and Office 365 (see footnote 4). This is particularly useful for ensuring that your email service properly delivers messages to, and accepts messages from, other service providers. Just like Email Round-Trip Time, Email Transit Time involves two protocols:
- SMTP. Sends messages from a client to a server.
- IMAP. Retrieves messages from the server to a client.
Let’s take a look at the full process for evaluating Email Transit Time using Gmail and Office 365 as an example:
- We use SMTP to deliver a message from the Collector to provider A (e.g. Gmail)
- Still using SMTP, provider A delivers the message to its destination address at a second provider (e.g. Office 365)
- Finally, we use IMAP to retrieve the message from Office 365 back to the Collector
This process highlights the two critical components of email service monitoring: the time it takes for messages sent from your business's email server to be delivered externally, and the time it takes for messages sent externally to reach your inbox. With this in mind, the Email Transit Time DataSource contains an ErrorCode datapoint that denotes the stage of the above process at which an error occurred (if any). The following are applicable error codes:
Setting up Email Transit Time Monitoring
To apply Email Transit Time, first you’ll need to add email_transit to the system.categories property. Next you’ll need to populate connection properties for both email services (i.e. the sending service and receiving service).
In the following table, we’ve listed the seven connection properties required for both sending and receiving services, using Gmail and Microsoft Office 365 as examples.
Property | Value | Example |
Example Properties for Gmail Service | ||
transit.gmail.imapHost | Hostname of the Gmail IMAP service (used when retrieving email from Gmail) | imap.gmail.com |
transit.gmail.imapType | Encryption type for the Gmail IMAP service (used when retrieving email from Gmail) | SSL (port 993) |
transit.gmail.smtpHost | Hostname of the Gmail SMTP service (used when delivering email via Gmail) | smtp.gmail.com |
transit.gmail.smtpType | Encryption type for the Gmail SMTP service (used when delivering email via Gmail) | Blank (port 25, see footnote 3), SSL (port 465), or TLS (port 587) |
transit.gmail.addr | Email address for the Gmail account | [email protected] |
transit.gmail.user | User ID for the Gmail account | [email protected] |
transit.gmail.pass | Password for the Gmail account | [email protected] |
Example Properties for Office 365 Service | ||
transit.o365.imapHost | Hostname of the Office 365 IMAP service (used when retrieving email from Office 365) | outlook.office365.com |
transit.o365.imapType | Encryption type for the Office 365 IMAP service (used when retrieving email from Office 365) | SSL (port 993) |
transit.o365.smtpHost | Hostname of the Office 365 SMTP service (used when delivering email via Office 365) | smtp.office365.com |
transit.o365.smtpType | Encryption type for the Office 365 SMTP service (used when delivering email via Office 365) | Blank (port 25, see footnote 3), SSL (port 465), or TLS (port 587) |
transit.o365.addr | Email address for the Office 365 account | [email protected] |
transit.o365.user | User ID for the Office 365 account | [email protected] |
transit.o365.pass | Password for the Office 365 account | [email protected] |
It is essential that the “keys” used for the Gmail and Office 365 services in the device properties above (e.g. transit.gmail.smtpHost) are consistent across each service.
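The SMTP security types and ports listed in the table above, together with the smtp.force.TLS.25 property described in footnote 3, can be summarized as a small lookup. The Python sketch below is illustrative only: the function name and the returned dictionary are hypothetical, and it assumes that "TLS" on port 587 means upgrading a plain connection with STARTTLS while "SSL" on port 465 means an encrypted connection from the start.

```python
def smtp_connection_plan(smtp_type, force_tls_25=""):
    """Map an SMTP security type (blank/SSL/TLS) to a port and encryption mode.

    force_tls_25 is the raw value of the optional smtp.force.TLS.25 property.
    """
    kind = (smtp_type or "").strip().upper()
    if kind == "":
        # Blank selects port 25, where TLS stays off unless explicitly forced.
        forced = str(force_tls_25).strip().upper() == "TRUE"
        return {"port": 25, "implicit_ssl": False, "starttls": forced}
    if kind == "SSL":
        return {"port": 465, "implicit_ssl": True, "starttls": False}
    if kind == "TLS":
        # Assumption: TLS on 587 is negotiated with STARTTLS.
        return {"port": 587, "implicit_ssl": False, "starttls": True}
    raise ValueError(f"unsupported SMTP security type: {smtp_type!r}")
```

For example, `smtp_connection_plan("", "TRUE")` describes a port 25 connection that is upgraded to TLS, which corresponds to the case covered by smtp.force.TLS.25.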
Finally, once you’ve added all seven necessary device properties for each service, you’ll need to create monitoring instances for each transit direction. To do so, select Add Monitored Instance from the Manage Device drop-down menu to display the Add Monitored Instance dialog. Continuing with the example Gmail and Office 365 properties listed in the previous table, fill out the fields in this dialog as follows:
Field name | Value |
DataSource | Email Transit Time |
Name | Gmail -> Office365 |
Wildcard Value (sender:receiver) | gmail:o365 |
In the Wildcard Value field, the colon-separated keys must match the keys used to represent the sending and receiving services.
After adding an instance for one transit direction, create a second monitored instance with the following settings to measure the reverse direction.
Field name | Value |
DataSource | Email Transit Time |
Name | Office365 -> Gmail |
Wildcard Value (sender:receiver) | o365:gmail |
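To illustrate how a wildcard value such as gmail:o365 relates to the transit.<key>.* device properties, here is a minimal Python sketch. It is not the DataSource's own code: the function names and error messages are hypothetical, and it assumes all seven properties must be present for both the sending and the receiving service.

```python
REQUIRED = ("imapHost", "imapType", "smtpHost", "smtpType", "addr", "user", "pass")


def service_properties(props, key):
    """Collect the seven transit.<key>.* properties, raising if any are absent."""
    missing = [name for name in REQUIRED if f"transit.{key}.{name}" not in props]
    if missing:
        raise KeyError(f"service '{key}' is missing: {', '.join(missing)}")
    return {name: props[f"transit.{key}.{name}"] for name in REQUIRED}


def resolve_instance(props, wildcard):
    """Split a sender:receiver wildcard value and look up both services."""
    sender_key, sep, receiver_key = wildcard.partition(":")
    if not sep or not sender_key or not receiver_key or ":" in receiver_key:
        raise ValueError(f"expected 'sender:receiver', got {wildcard!r}")
    return {
        "sender": service_properties(props, sender_key),
        "receiver": service_properties(props, receiver_key),
    }
```

Given the example properties, `resolve_instance(props, "gmail:o365")` would pair the Gmail settings as the sender with the Office 365 settings as the receiver, and a key that does not match any property set (for example a misspelled `o356`) fails immediately with the list of missing properties.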
Diagnostic logs for these DataSources are enabled by default, and can be found in the Collector’s “logs” directory, typically either:
/usr/local/logicmonitor/agent/logs
or
C:\Program Files (x86)\LogicMonitor\Agent\logs
Logs for Email Round-Trip Time are stored as:
- [hostname]-emailRTT-protocol.log
- [hostname]-emailRTT-debug.log
while Email Transit Time logs are:
- [hostname]-[instance]-emailTransit-protocol.log
- [hostname]-[instance]-emailTransit-debug.log
The protocol logs show the underlying SMTP and IMAP commands issued by the Collector and the corresponding responses from the target server(s). You may need some familiarity with the SMTP and IMAP protocols to interpret them, but they frequently contain helpful information. The debug logs show telemetry from the DataSource itself, indicating which steps were attempted and where each succeeded or failed.
Footnotes
1 Because this DataSource uses IMAP for message retrieval, in a Microsoft Exchange or Office 365 environment you’ll need to specifically enable access via IMAP to use this DataSource.
2 These measurements take place from the perspective of the Collector assigned to monitor the device to which these DataSources are applied. In other words, the messages are sent and retrieved by the device’s Collector rather than by the device itself. You can apply these DataSources to the server your Collector is installed on (if it has been added into monitoring), or to any other monitored device.
3 TLS is disabled when the SMTP security type is configured to use port 25. If TLS is required on a resource that is configured to use port 25, set the optional property smtp.force.TLS.25 to TRUE on that resource.
4 When using public email services (such as Yahoo Mail, Gmail, Outlook.com, etc.) to help monitor email transit time, be aware that the anti-spam measures employed by these services may interfere with our monitoring instrumentation. Although we’ve had good luck using the Gmail service for this purpose, message hygiene services evolve over time. As such, we can’t warrant the use of these DataSources against any public service. For best results use the email service of a business partner as a third party to validate transit times.