Troubleshooting: "Offline" agent

A computer status of "Offline" or "Managed (Offline)" means that Deep Security Manager hasn't communicated with the Deep Security Agent's instance for longer than the missed heartbeat threshold allows. (See Configure the heartbeat.) The status change can also appear in alerts and events.

Causes

Heartbeat connections can fail because:

  • Firewall, IPS rule, or security groups block the heartbeat port number
  • Bi-directional communication is enabled, but only one direction is allowed or reliable (see Configure communication directionality)
  • Computer is powered off
  • Computer has left the context of the private network
    This can occur if roaming endpoints (such as a laptop) cannot connect to Deep Security Manager at their current location. Guest Wi-Fi, for example, often restricts open ports, and has NAT when traffic goes across the Internet.
  • Amazon WorkSpace computer is being powered off, and the heartbeat interval is fast (for example, one minute). In this case, wait until the WorkSpace is fully powered off; the status should then change from 'Offline' to 'VM Stopped'
  • DNS is down, or cannot resolve the Deep Security Manager's host name
  • Deep Security Manager, the agent, or both are under very high system resource load
  • Deep Security Agent process might not be running
  • Certificates for mutual authentication in the SSL or TLS connection have become invalid or revoked (see Replace the Deep Security Manager SSL certificate)
  • Deep Security Agent's or Deep Security Manager's system time is incorrect (required by SSL/TLS connections)
  • Deep Security rule update is not yet complete, temporarily interrupting connectivity
  • On AWS EC2, ICMP traffic is required, but is blocked

If you are using manager-initiated or bi-directional communication, and are having communication issues, we strongly recommend that you change to agent-initiated activation (see Use agent-initiated communication with cloud accounts).

To troubleshoot the error, verify that the Deep Security Agent is running, and then that it can communicate with Deep Security Manager.

Verify that the agent is running

On the computer with Deep Security Agent, verify that the Trend Micro Deep Security Agent service is running. The method varies by operating system.

  • On Windows, open the Microsoft Windows Services Console (services.msc) or Task Manager. Look for the service named ds_agent.
  • On Linux, open a terminal and enter the command for a process listing. Look for the service named ds_agent or ds-agent, such as:

    sudo ps aux | grep ds_agent

    sudo service ds_agent status

  • On Solaris, open a terminal and enter the command for a process listing. Look for the service named ds_agent, such as:

    sudo ps -ef | grep ds_agent

    sudo svcs -l svc:/application/ds_agent:default
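
The process check above can also be scripted. The following is a minimal sketch, assuming that pgrep is installed and that the agent's process name is ds_agent; the function name agent_running is hypothetical and is not part of Deep Security.

```shell
#!/bin/sh
# Report whether a process with the given exact name is running.
# Usage: agent_running [process_name]   (defaults to ds_agent)
# Assumption: pgrep is available on this computer.
agent_running() {
    name="${1:-ds_agent}"
    # -x matches the whole process name, so the grep or pgrep
    # command itself is never reported as a false positive.
    if pgrep -x "$name" >/dev/null 2>&1; then
        echo "$name is running"
        return 0
    fi
    echo "$name is NOT running"
    return 1
}
```

For example, "agent_running || echo 'start the agent service first'" prints the reminder only when no ds_agent process is found.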

Verify DNS

If agents connect to the Deep Security Manager via its domain name or hostname, not its IP address, test the DNS resolution:

nslookup [manager domain name]

DNS service must be reliable.

If the test fails, verify that the agent is using the correct DNS proxy or server (internal domain names can't be resolved by a public DNS server such as Google or your ISP). If a name such as dsm.example.com cannot be resolved into its IP address, communication will fail, even though correct routes and firewall policies exist for the IP address.
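
As a complement to nslookup, the sketch below checks resolution through the system resolver on Linux. It assumes that getent is available; the function names resolve_host and can_resolve are hypothetical. Unlike nslookup, which queries DNS servers directly, getent follows the computer's name service configuration, so it also honors entries in /etc/hosts.

```shell
#!/bin/sh
# Print the first address that the system resolver returns for a name.
# Assumption: getent is available (typical on Linux).
resolve_host() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# Return 0 if the name resolves to an address, non-zero otherwise.
can_resolve() {
    [ -n "$(resolve_host "$1")" ]
}
```

For example, "can_resolve dsm.example.com || echo 'cannot resolve the manager name'" reports a failure for the example name used above.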

If the computer uses DHCP, in the computer or policy settings, in the Advanced Network Engine area, you might need to enable Force Allow DHCP DNS (see Network engine settings).

Allow outbound ports (agent-initiated heartbeat)

Telnet to required port numbers on Deep Security Manager to verify that a route exists, and the port is open:

telnet [manager IP] 4120

Telnet success proves most of the same things as a ping: that a route and correct firewall policy exist, and that Ethernet frame sizes are correct. (Ping is disabled on computers that use the default security policy for Deep Security Manager. Networks sometimes block ICMP ping and traceroute to block attackers' reconnaissance scans. So usually, you can't ping the Manager to test.)
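
If telnet is not installed, a TCP connection test can be approximated in other ways. The sketch below assumes that bash (with its /dev/tcp redirection feature) and the coreutils timeout command are available; the function name port_open is hypothetical.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within the time limit.
# Usage: port_open HOST PORT [SECONDS]   (default limit: 5 seconds)
# Assumptions: bash supports /dev/tcp, and timeout (coreutils) is installed.
port_open() {
    local host="$1" port="$2" wait="${3:-5}"
    # The child bash opens the socket on file descriptor 3 and then exits,
    # which closes the connection. Errors are discarded; only the exit
    # status matters.
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

For example, "port_open 192.0.2.10 4120 && echo reachable" tests the port used in the telnet example, where 192.0.2.10 is a placeholder for the manager's IP address. A failure can mean either a refused connection or a timeout, so it narrows the problem but does not identify which device blocked the traffic.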

If telnet fails, trace the route to discover which point on the network is interrupting connectivity. Methods vary by operating system.

  • On Linux, enter the command:

    traceroute [manager IP]

  • On Windows, enter the command:

    tracert [manager IP]

Adjust firewall policies, routes, NAT port forwarding, or all three to correct the problem. Verify both network and host-based firewalls, such as Windows Firewall and Linux iptables. For an AWS EC2 instance, see Amazon's documentation on Amazon EC2 Security Groups for Linux Instances or Amazon EC2 Security Groups for Windows Instances. For an Azure VM instance, see Microsoft's Azure documentation on modifying a Network Security Group.

If connectivity tests from the agent to the manager succeed, next test connectivity in the other direction. (Firewalls and routers often require policy-route pairs to allow connectivity. If only one of the two required policies or routes exists, then packets will be allowed in one direction, but not the other.)

Allow inbound ports (manager-initiated heartbeat)

On the Deep Security Manager, ping the Deep Security Agent and telnet to the heartbeat port number to verify that heartbeat and configuration traffic can reach the agent:

ping [agent IP]

telnet [agent IP] 4118

If the ping and telnet fail, use:

traceroute [agent IP]

to discover which point on the network is interrupting connectivity. Adjust firewall policies, routes, NAT port forwarding, or all three to correct the problem.

If IPS or firewall rules are blocking the connection between the Deep Security Agent and the Deep Security Manager, then the manager cannot connect in order to unassign the policy that is causing the problem. To solve this, enter the following command on the agent's computer to reset policies on the agent:

dsa_control -r

You must re-activate the agent after running this command.

Allow ICMP on Amazon AWS EC2 instances

In the AWS cloud, routers require ICMP type 3 code 4. If this traffic is blocked, connectivity between agents and the manager may be interrupted.

You can force allow this traffic in Deep Security. Either create a firewall policy with a force allow, or in the computer or policy settings, in the Advanced Network Engine area, enable Force Allow ICMP type3 code4 (see Network engine settings).