Sunday, December 28, 2008

Java and SSL Certificates

Overview


I have a Linux box at home, running Apache 2.2, that I use to archive pictures. I use an application called Gallery as a front end to organize and view the photos, and a Java application called Gallery Remote to upload pictures to the server. I've also added SSL encryption so that the username and password used to access the site are not sent in the clear.

Problem and Solution


The problem was that Gallery Remote wasn't able to connect to the server. It seemed to be having problems with the SSL certificate on the web server. I was using a self-signed SSL certificate, so that was definitely possible. I checked the certificate and found that it had expired. I generated a new certificate using the instructions on the Apache website. The relevant text is shown below.

How do I create a self-signed SSL Certificate for testing purposes?

  1. Make sure OpenSSL is installed and in your PATH.

  2. Run the following command, to create server.key and
    server.crt files:
    $ openssl req -new -x509 -nodes -out server.crt -keyout server.key

    These can be used as follows in your httpd.conf
    file:
    SSLCertificateFile    /path/to/this/server.crt
    SSLCertificateKeyFile /path/to/this/server.key


  3. It is important that you are aware that this
    server.key does not have any passphrase.
    To add a passphrase to the key, you should run the following
    command, and enter & verify the passphrase as requested.

    $ openssl rsa -des3 -in server.key -out server.key.new

    $ mv server.key.new server.key


    Please backup the server.key file, and the passphrase
    you entered, in a secure location.
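
Since the root cause here was an expired certificate, it can be worth checking the expiry date before anything else. The following is a minimal sketch, not part of the Apache FAQ. It assumes you have run `openssl x509 -enddate -noout -in server.crt` and have the `notAfter=...` line that command prints; the function names are hypothetical.

```python
import ssl
import time


def parse_enddate(line):
    """Turn an `openssl x509 -enddate` line into seconds since the epoch."""
    # Assumed input shape: "notAfter=Dec 28 12:00:00 2008 GMT"
    prefix = "notAfter="
    if not line.startswith(prefix):
        raise ValueError("not an -enddate line: %r" % line)
    return ssl.cert_time_to_seconds(line[len(prefix):].strip())


def is_expired(enddate_line, now=None):
    """Return True if the certificate's notAfter time has already passed."""
    if now is None:
        now = time.time()
    return parse_enddate(enddate_line) < now
```

Calling `is_expired("notAfter=Dec 28 12:00:00 2008 GMT")` compares that date against the current time, so an old certificate is flagged without reading the full certificate dump.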



After restarting the webserver, I was still having problems with Gallery Remote. I then found out that Java has its own repository of trusted SSL certificates. My SSL certificate was self-signed, so it definitely wasn't in the default trust list. One method of adding the certificate is through the Java control panel. Another is to add it from the command line, as described on the Gallery Remote FAQ page. The relevant text is shown below.

Using HTTPS

You can use https:// URLs with Gallery Remote to connect to secured web sites. This functionality is only available on Java 1.4 and later. If the site you are attempting to connect to uses a server certificate that is not certified by a trusted certificate authority, Gallery Remote will be unable to connect. If this happens, you will need to add the site's certificate to the Java registry of trusted certificates:

For Windows:
  1. Go to the site with Internet Explorer
  2. Go to menu File>Properties
  3. In the Properties window, click Certificates
  4. On the Details tab, click Copy to File...
  5. In the wizard, select DER-encoded X.509 certificate and save it to a
    file
  6. Open a console window (cmd.exe)
  7. Type the following command-line:

     keytool -import -trustcacerts -file path_to_cer_file -keystore %JAVA_HOME%/jre/lib/security/cacerts -alias arbitrary_name

You'll be prompted for the store password, which by default is changeit
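
The FAQ steps assume Windows and Internet Explorer. As a rough sketch of the same idea from a script, the code below fetches the server certificate, saves it DER-encoded, and builds the `keytool` argument list from the FAQ. The function names, host, and paths are hypothetical, and the keystore location simply mirrors the `jre/lib/security/cacerts` path quoted above, which may differ on your Java installation.

```python
import os
import ssl


def save_server_cert_der(host, port, out_path):
    """Fetch a server's certificate and save it DER-encoded."""
    # get_server_certificate does not validate the certificate by default,
    # so it also works for a self-signed certificate. Check the saved file
    # is really yours before trusting it.
    pem = ssl.get_server_certificate((host, port))
    with open(out_path, "wb") as handle:
        handle.write(ssl.PEM_cert_to_DER_cert(pem))


def keytool_import_args(cert_file, java_home, alias):
    """Build the keytool command from the FAQ as an argument list."""
    keystore = os.path.join(java_home, "jre", "lib", "security", "cacerts")
    return ["keytool", "-import", "-trustcacerts",
            "-file", cert_file,
            "-keystore", keystore,
            "-alias", alias]
```

The argument list can be passed to `subprocess.run`; keytool will still prompt for the store password unless you supply it yourself.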


I used that and it worked. It's interesting to note that Java uses its own keystore and that the keystore has a default password when accessed from the command line.

Tuesday, December 16, 2008

Allowing CSA Management Center to access WSUS server

The pre-configured CSA 6.0 policy for the CSA Management Center does not allow connectivity to the WSUS server. Rule 269 blocks the access, as shown below.



You'll notice that the screenshot is taken from the events shown in the agent GUI on the management center. This is because rule 269 does not log by default, so the denied packets do not show up in the management center event logs. To view them there, you need to do one of two things:

  • Explicitly turn on logging for rule 269
  • Enable log overrides for a particular group

Once logging is enabled by either method, the denied events appear in the management center event logs, which helps with the troubleshooting process.

The problem is that rule 269 blocks all network traffic not explicitly allowed by another rule. Since rule 269 applies to the "CSA MC Network Security Module", it only affects the management center. This is why WSUS updates work fine with the pre-configured server and desktop rules. In those policies, there is no rule explicitly blocking network traffic. The default action is to allow traffic, so the WSUS update traffic is allowed for desktops and servers.
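
The difference between the two policies can be illustrated with a small model. This is a simplified sketch and not how the CSA engine is actually implemented: it assumes rules are checked in order, the first match wins, and an unmatched connection falls through to a default action. The `Rule` class, host name, and process names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str                     # "allow" or "deny"
    process: Optional[str] = None   # None matches any process
    dest: Optional[str] = None      # None matches any destination
    port: Optional[int] = None      # None matches any port

    def matches(self, process, dest, port):
        return (self.process in (None, process)
                and self.dest in (None, dest)
                and self.port in (None, port))


def evaluate(rules, process, dest, port, default="allow"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(process, dest, port):
            return rule.action
    return default


# Management center: a specific exception ahead of a catch-all deny
# (the catch-all stands in for rule 269).
MC_RULES = [
    Rule("allow", process="svchost.exe", dest="wsus.example.com", port=80),
    Rule("deny"),
]
# Desktops and servers: no catch-all deny, so the default applies.
DESKTOP_RULES = []
```

With these lists, `evaluate(DESKTOP_RULES, ...)` always falls through to "allow", while `evaluate(MC_RULES, ...)` denies everything except the one exception.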

There are a number of ways to fix the problem for the management center. The easiest method is to use the Wizard in the event log entry for rule 269. The Wizard provides a method of easily creating an exception rule for the specific traffic that was blocked.

The first step is to locate the rule 269 event log entry and click on the Wizard link. This is shown in the red oval in the diagram below


The next step is to click on the "Allow Operation" radio button, provide a justification and click "Finish". This is shown below.


After "Finish" is clicked, the necessary variables and rule are created. The next step is to generate the policy to install the rules. The diagram below shows the variables and rules that will be generated.


After the rule generation, there should now be an exception rule that allows access to the WSUS server to get Microsoft updates. This is shown below.


A closer inspection of the rule shows that it is a granular rule only allowing executable "svchost.exe -k netsvc" to talk to the WSUS server, as a client, on port 80/tcp. This is shown below.


To verify that the rule is really working, you can temporarily turn on logging for the exception rule. This is shown below.


A reboot of the management center should kick off the WSUS update check again. Once this is completed, something similar to the following should be in the event log


After verifying the success of the exception rule, make sure to turn off logging on the exception rule and any other logging that was turned on for troubleshooting purposes above.

Monday, December 15, 2008

Cisco CSA 6.0 Upgrade Note

When upgrading to CSA 6.0, most of the effort is concentrated on upgrading the management center. Cisco provides fairly well documented instructions in their installation guide. The part they don't talk about enough is the CSA Agent upgrade to 6.0.

One problem I ran into has to do with upgrading clients running Windows XP SP3. According to the CSA 5.2 release notes, CSA 5.2 only supports Windows XP SP 0, 1, or 2. Of course, you can surmise that they just forgot to update their release notes with XP SP3 support, since the documentation is dated 4/2/07 and Windows XP SP3 came out on 5/6/08. Unfortunately, that is not true. This can be seen when viewing the "Host Identification" information under "Systems > Hosts > [hostname]". I've shown an example below. The "unsupported" information is shown in red.



Despite this screen, the CSA 5.2 Agent works fine after the upgrade to XP SP3. The big problem comes when the Management Center is upgraded to version 6.0 and you try to use a scheduled software update to upgrade all the agents to 6.0. This does not work. The normal process is:
  1. Access "Systems > Software Updates > Scheduled Software Updates"
  2. Create a new item that schedules the update for a particular group
  3. The agents check in with the Management Center, download the update, and install it
The problem is that the agents never download the update. The "Systems > Hosts > [hostname]" page always shows the software version as "Agent is running the latest software" instead of "Update Available". Both of these screenshots are shown below.





The only workaround I found is to create a new CSA 6.0 agent kit and push it to all the users via your normal application installation mechanism (Altiris, SMS, ...).

Sunday, December 7, 2008

Cisco NAC Manager HA log files

The NAC Manager documentation provides a number of logs that can be viewed to troubleshoot various issues. The only problem is that when a problem occurs it would be really nice to have a reference showing what good log output looks like. That's what I'd like to share here. Hopefully this will help someone troubleshooting an issue.

The logs below are from a standby NAM as it becomes active.
/perfigo/control/tomcat/logs/localhost_log..txt

2008-11-19 14:22:58 StandardHost[localhost]: Removing web application at context path /admin
2008-11-19 14:22:58 StandardHost[localhost]: Removing web application at context path
2008-11-19 14:23:03 WebappLoader[/admin]: Deploying class repositories to work directory /perfigo/control/tomcat/work/Standalone/localhost/admin
2008-11-19 14:23:03 WebappLoader[/admin]: Deploy JAR /WEB-INF/lib/jsf_hack_tld.jar to /perfigo/control/tomcat/webapps/admin/WEB-INF/lib/jsf_hack_tld.jar
2008-11-19 14:23:04 ContextConfig[/admin]: Configured an authenticator for method NONE
2008-11-19 14:23:04 PersistentManager[/admin]: Seeding random number generator class java.security.SecureRandom
2008-11-19 14:23:04 PersistentManager[/admin]: Seeding of random number generator has been completed
2008-11-19 14:23:04 PersistentManager[/admin]: No Store configured, persistence disabled
2008-11-19 14:23:22 StandardWrapper[/admin:default]: Loading container servlet default
2008-11-19 14:23:22 StandardWrapper[/admin:invoker]: Loading container servlet invoker
2008-11-19 14:23:22 HostConfig[localhost]: Deploying web application directory ROOT
2008-11-19 14:23:22 StandardHost[localhost]: Installing web application at context path from URL file:/perfigo/control/tomcat/normal-webapps/ROOT
2008-11-19 14:23:22 WebappLoader[]: Deploying class repositories to work directory /perfigo/control/tomcat/work/Standalone/localhost/_
2008-11-19 14:23:22 StandardManager[]: Seeding random number generator class java.security.SecureRandom
2008-11-19 14:23:22 StandardManager[]: Seeding of random number generator has been completed
2008-11-19 14:23:23 StandardWrapper[:default]: Loading container servlet default
2008-11-19 14:23:23 StandardWrapper[:invoker]: Loading container servlet invoker
2008-11-19 14:23:23 HostConfig[localhost]: Deploying web application directory upload
2008-11-19 14:23:23 StandardHost[localhost]: Installing web application at context path /upload from URL file:/perfigo/control/tomcat/normal-webapps/upload
2008-11-19 14:23:23 WebappLoader[/upload]: Deploying class repositories to work directory /perfigo/control/tomcat/work/Standalone/localhost/upload
2008-11-19 14:23:23 StandardManager[/upload]: Seeding random number generator class java.security.SecureRandom
2008-11-19 14:23:23 StandardManager[/upload]: Seeding of random number generator has been completed
2008-11-19 14:23:23 StandardWrapper[/upload:default]: Loading container servlet default
2008-11-19 14:23:23 StandardWrapper[/upload:invoker]: Loading container servlet invoker
2008-11-19 14:23:23 HostConfig[localhost]: Deploying web application directory wlan
2008-11-19 14:23:23 StandardHost[localhost]: Installing web application at context path /wlan from URL file:/perfigo/control/tomcat/normal-webapps/wlan
2008-11-19 14:23:23 WebappLoader[/wlan]: Deploying class repositories to work directory /perfigo/control/tomcat/work/Standalone/localhost/wlan
2008-11-19 14:23:23 ContextConfig[/wlan]: Configured an authenticator for method NONE
2008-11-19 14:23:23 StandardManager[/wlan]: Seeding random number generator class java.security.SecureRandom
2008-11-19 14:23:23 StandardManager[/wlan]: Seeding of random number generator has been completed
2008-11-19 14:23:23 StandardWrapper[/wlan:default]: Loading container servlet default
2008-11-19 14:23:23 StandardWrapper[/wlan:invoker]: Loading container servlet invoker
2008-11-19 14:23:23 HostConfig[localhost]: Deploying web application directory packages
2008-11-19 14:23:23 StandardHost[localhost]: Installing web application at context path /packages from URL file:/perfigo/control/tomcat/normal-webapps/packages
2008-11-19 14:23:23 WebappLoader[/packages]: Deploying class repositories to work directory /perfigo/control/tomcat/work/Standalone/localhost/packages
2008-11-19 14:23:23 StandardManager[/packages]: Seeding random number generator class java.security.SecureRandom
2008-11-19 14:23:23 StandardManager[/packages]: Seeding of random number generator has been completed
2008-11-19 14:23:24 StandardWrapper[/packages:default]: Loading container servlet default
2008-11-19 14:23:24 StandardWrapper[/packages:invoker]: Loading container servlet invoker
2008-11-19 14:23:24 HostConfig[localhost]: Deploying web application directory download
2008-11-19 14:23:24 StandardHost[localhost]: Installing web application at context path /download from URL file:/perfigo/control/tomcat/normal-webapps/download
2008-11-19 14:23:24 WebappLoader[/download]: Deploying class repositories to work directory /perfigo/control/tomcat/work/Standalone/localhost/download
2008-11-19 14:23:24 StandardManager[/download]: Seeding random number generator class java.security.SecureRandom
2008-11-19 14:23:24 StandardManager[/download]: Seeding of random number generator has been completed
2008-11-19 14:23:24 StandardWrapper[/download:default]: Loading container servlet default
2008-11-19 14:23:24 StandardWrapper[/download:invoker]: Loading container servlet invoker
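
A quick way to compare a log like this against a known-good one is to list which web application contexts finished loading. The sketch below is only an illustration: it assumes the "Loading container servlet invoker" line is a reasonable marker that a context came up, and the function name is hypothetical.

```python
import re

# Matches e.g. "StandardWrapper[/admin:invoker]: Loading container servlet invoker"
INVOKER = re.compile(
    r"StandardWrapper\[(.*?):invoker\]: Loading container servlet invoker")


def loaded_contexts(lines):
    """Return context paths in load order; the empty path is reported as ROOT."""
    contexts = []
    for line in lines:
        match = INVOKER.search(line)
        if match:
            contexts.append(match.group(1) or "ROOT")
    return contexts
```

Run against the log above, this should list /admin, ROOT, /upload, /wlan, /packages and /download, which gives a short checklist to compare against when a failover does not look right.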

/var/log/ha-log
heartbeat: 2008/11/19_14:22:58 info: Received shutdown notice from 'camanager1'.
heartbeat: 2008/11/19_14:22:58 info: Resources being acquired from camanager1.
heartbeat: 2008/11/19_14:22:58 info: acquire all HA resources (standby).
heartbeat: 2008/11/19_14:22:58 info: No local resources [/usr/lib64/heartbeat/ResourceManager listkeys camanager2] to acquire.
heartbeat: 2008/11/19_14:22:58 info: Acquiring resource group: camanager1 x.x.x.x controlsmart
heartbeat: 2008/11/19_14:22:58 info: Running /etc/ha.d/resource.d/IPaddr x.x.x.x start
heartbeat: 2008/11/19_14:22:58 info: /sbin/ifconfig eth0:0 x.x.x.x netmask 255.255.255.0 broadcast 172.31.31.255
heartbeat: 2008/11/19_14:22:58 info: Sending Gratuitous Arp for x.x.x.x on eth0:0 [eth0]
heartbeat: 2008/11/19_14:22:58 /usr/lib64/heartbeat/send_arp -i 1010 -r 5 -p /var/lib/heartbeat/rsctmp/send_arp/send_arp-x.x.x.x eth0 x.x.x.x auto x.x.x.x ffffffffffff
heartbeat: 2008/11/19_14:22:58 info: Running /perfigo/control/bin/controlsmart start
heartbeat: 2008/11/19_14:23:02 info: all HA resource acquisition completed (standby).
heartbeat: 2008/11/19_14:23:02 info: Standby resource acquisition done [all].
heartbeat: 2008/11/19_14:23:02 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2008/11/19_14:23:04 info: Taking over resource group x.x.x.x
heartbeat: 2008/11/19_14:23:04 info: Acquiring resource group: camanager1 x.x.x.x controlsmart
heartbeat: 2008/11/19_14:23:04 info: Running /perfigo/control/bin/controlsmart start
heartbeat: 2008/11/19_14:23:04 info: /usr/lib64/heartbeat/mach_down: nice_failback: foreign resources acquired
heartbeat: 2008/11/19_14:23:04 info: mach_down takeover complete.
heartbeat: 2008/11/19_14:23:04 info: mach_down takeover complete for node camanager1.
heartbeat: 2008/11/19_14:23:14 WARN: node camanager1: is dead
heartbeat: 2008/11/19_14:23:14 info: Dead node camanager1 gave up resources.
heartbeat: 2008/11/19_14:23:14 info: Link camanager1:eth1 dead.
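
The heartbeat log can also be used to measure how long a takeover took. This is a minimal sketch that assumes every line follows the `heartbeat: date_time level: message` shape seen above (a few lines have no level). The function names and the choice of start and end markers are my assumptions.

```python
import re
from datetime import datetime

LINE = re.compile(
    r"^heartbeat: (\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2}) (?:(\w+): )?(.*)$")


def parse(line):
    """Split a heartbeat line into (timestamp, level or None, message)."""
    match = LINE.match(line.strip())
    if not match:
        return None
    when = datetime.strptime(match.group(1), "%Y/%m/%d_%H:%M:%S")
    return when, match.group(2), match.group(3)


def seconds_between(lines, start_text, end_text):
    """Seconds from the first line containing start_text to the next with end_text."""
    start = end = None
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue
        when, _level, message = parsed
        if start is None and start_text in message:
            start = when
        elif start is not None and end is None and end_text in message:
            end = when
    if start is None or end is None:
        return None
    return (end - start).total_seconds()
```

For the log above, measuring from "Received shutdown notice" to "takeover complete" covers 14:22:58 to 14:23:04.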

These logs show what happens when a NAM starts as the active NAM and "service perfigo stop" is then entered to turn off the NAC service.
/perfigo/control/tomcat/logs/localhost_log..txt
2008-11-19 14:22:55 StandardHost[localhost]: Removing web application at context path /admin
2008-11-19 14:22:55 StandardHost[localhost]: Removing web application at context path /upload
2008-11-19 14:22:55 StandardHost[localhost]: Removing web application at context path /download
2008-11-19 14:22:55 StandardHost[localhost]: Removing web application at context path /packages
2008-11-19 14:22:55 StandardHost[localhost]: Removing web application at context path /wlan
2008-11-19 14:22:55 StandardHost[localhost]: Removing web application at context path

/var/log/ha-debug
heartbeat: 2008/11/19_14:22:54 info: Heartbeat shutdown in progress. (4516)
heartbeat: 2008/11/19_14:22:54 info: Giving up all HA resources.
heartbeat: 2008/11/19_14:22:54 info: Releasing resource group: camanager1 x.x.x.x controlsmart
heartbeat: 2008/11/19_14:22:54 info: Running /perfigo/control/bin/controlsmart stop
heartbeat: 2008/11/19_14:22:58 info: Running /etc/ha.d/resource.d/IPaddr x.x.x.x stop
heartbeat: 2008/11/19_14:22:58 info: /sbin/route -n del -host x.x.x.x
heartbeat: 2008/11/19_14:22:58 info: /sbin/ifconfig eth0:0 down
heartbeat: 2008/11/19_14:22:58 info: IP Address x.x.x.x released
heartbeat: 2008/11/19_14:22:58 info: All HA resources relinquished.
heartbeat: 2008/11/19_14:22:59 info: killing HBREAD process 4521 with signal 15
heartbeat: 2008/11/19_14:22:59 info: killing HBFIFO process 4519 with signal 15
heartbeat: 2008/11/19_14:22:59 info: killing HBWRITE process 4520 with signal 15
heartbeat: 2008/11/19_14:22:59 info: Core process 4519 exited. 3 remaining
heartbeat: 2008/11/19_14:22:59 info: Core process 4520 exited. 2 remaining
heartbeat: 2008/11/19_14:22:59 info: Core process 4521 exited. 1 remaining
heartbeat: 2008/11/19_14:22:59 info: Heartbeat shutdown complete.

This shows the status after "service perfigo start" is entered while another NAM is active.
[root@camanager1 logs]# service perfigo start
Starting High-Availability services:
[ OK ]
Please wait while bringing up service IP.
Heartbeat service is running.
Service IP is up on the peer node.
Stopping postgresql service: [ OK ]
Starting postgresql service: [ OK ]
DROP DATABASE
CREATE DATABASE
DROP DATABASE
CREATE DATABASE
Database synced
[root@camanager1 logs]#

/perfigo/control/tomcat/logs/localhost_log..txt
2008-11-19 14:30:14 WebappLoader[/admin]: Deploying class repositories to work directory /perfigo/control/tomcat/work/Standalone/localhost/admin
2008-11-19 14:30:14 WebappLoader[/admin]: Deploy JAR /WEB-INF/lib/jsf_hack_tld.jar to /perfigo/control/tomcat/webapps/admin/WEB-INF/lib/jsf_hack_tld.jar
2008-11-19 14:30:14 ContextConfig[/admin]: Configured an authenticator for method NONE
2008-11-19 14:30:14 PersistentManager[/admin]: Seeding random number generator class java.security.SecureRandom
2008-11-19 14:30:14 PersistentManager[/admin]: Seeding of random number generator has been completed
2008-11-19 14:30:14 PersistentManager[/admin]: No Store configured, persistence disabled
2008-11-19 14:30:15 StandardWrapper[/admin:default]: Loading container servlet default
2008-11-19 14:30:15 StandardWrapper[/admin:invoker]: Loading container servlet invoker
2008-11-19 14:30:15 HostConfig[localhost]: Deploying web application directory ROOT
2008-11-19 14:30:15 StandardHost[localhost]: Installing web application at context path from URL file:/perfigo/control/tomcat/admin-webapps/ROOT
2008-11-19 14:30:15 WebappLoader[]: Deploying class repositories to work directory /perfigo/control/tomcat/work/Standalone/localhost/_
2008-11-19 14:30:15 StandardManager[]: Seeding random number generator class java.security.SecureRandom
2008-11-19 14:30:15 StandardManager[]: Seeding of random number generator has been completed
2008-11-19 14:30:16 StandardWrapper[:default]: Loading container servlet default
2008-11-19 14:30:16 StandardWrapper[:invoker]: Loading container servlet invoker

/var/log/ha-log
heartbeat: 2008/11/19_14:27:25 info: **************************
heartbeat: 2008/11/19_14:27:25 info: Configuration validated. Starting heartbeat 1.2.5
heartbeat: 2008/11/19_14:27:25 info: heartbeat: version 1.2.5
heartbeat: 2008/11/19_14:27:26 info: Heartbeat generation: 44
heartbeat: 2008/11/19_14:27:26 info: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
heartbeat: 2008/11/19_14:27:26 info: ucast: trying to bind: eth1

heartbeat: 2008/11/19_14:27:26 info: ucast: bound send socket to device: eth1
heartbeat: 2008/11/19_14:27:26 info: ucast: try binding receive socket to device: eth1
heartbeat: 2008/11/19_14:27:26 info: ucast: could bind receive socket to device: eth1:fe00a8c0.
heartbeat: 2008/11/19_14:27:26 info: ucast: started on port 694 interface eth1 to 192.168.0.253
heartbeat: 2008/11/19_14:27:26 notice: Using watchdog device: /dev/watchdog
heartbeat: 2008/11/19_14:27:26 info: pid 19899 locked in memory.
heartbeat: 2008/11/19_14:27:26 info: Local status now set to: 'up'
heartbeat: 2008/11/19_14:27:27 info: pid 19902 locked in memory.
heartbeat: 2008/11/19_14:27:27 info: pid 19903 locked in memory.
heartbeat: 2008/11/19_14:27:27 info: pid 19904 locked in memory.
heartbeat: 2008/11/19_14:27:27 info: Link camanager2:eth1 up.
heartbeat: 2008/11/19_14:27:27 info: Status update for node camanager2: status active
heartbeat: 2008/11/19_14:27:27 info: Local status now set to: 'active'
heartbeat: 2008/11/19_14:27:27 info: remote resource transition completed.
heartbeat: 2008/11/19_14:27:27 info: remote resource transition completed.
heartbeat: 2008/11/19_14:27:27 info: Local Resource acquisition completed. (none)
heartbeat: 2008/11/19_14:27:27 info: Initial resource acquisition complete (T_RESOURCES(them))
heartbeat: 2008/11/19_14:27:27 info: Running /etc/ha.d/rc.d/status status

Thursday, December 4, 2008

Essential Cisco NAC deployment tools

When deploying NAC there are a number of very useful tools that can help with implementing and troubleshooting. Here's a list of tools I've found useful


  • Wireshark - This is an open source network protocol analyzer that allows you to see exactly what traffic is going across the wire. There's a "Follow TCP Stream" feature that allows you to see the entire stream of traffic for a session. One place this can be used is when looking for certificate CRL information being sent from a client to a CA. You'll be able to see the exact URL that is being used in an easy to read manner.

  • LDAP Browser - This tool allows you to browse the LDAP tree to help determine what entries you should match on.

  • Kerbtray - This is one tool in a set of Microsoft resource kit tools that is meant for Windows 2003, but it also works on Windows XP. This tool provides information about Kerberos authentication. This is invaluable for troubleshooting AD SSO issues.

  • Camstudio - This is an open source video creation tool that you can use to create short video tutorials showing how NAC works. It can create an AVI or Flash file of your screen while you're demonstrating different NAC features. This can be a great tool for providing a visual representation of the NAC login process during end user training

  • Irfanview - This is a great tool for editing screenshots

Monday, November 10, 2008

Solution for Slow Cisco NAC WSUS Requirement Check

Slow NAC posture validation can be one of the biggest stumbling blocks for a successful NAC deployment. One of the biggest reasons for slow posture validation is the time it takes for the WSUS Requirement check. I've come up with a list of troubleshooting steps to try to reduce the time it takes for the WSUS Requirement checks

Troubleshooting Option 1: Use the Latest version of Windows Update Agent
The latest version of Windows Update Agent includes new features that speed up the WSUS check process. First, make sure that Windows Update Agent 3.0 release is being used on the client. Also, the KB927891 patch must be installed if you are running XP SP2. You can verify the version by looking at the version of the c:\WINDOWS\System32\wuaueng.dll file. The version should be 7.2.6001.784 as shown in the picture


Based on the links listed below, this Windows Update Agent release is backwards compatible with WSUS release 2.0.
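
If you collect the wuaueng.dll version from many clients, compare the versions numerically rather than as strings. The sketch below assumes that any version at or above 7.2.6001.784 is acceptable, which is my assumption and not something the links state; the function names are hypothetical.

```python
def parse_version(text):
    """Convert a dotted version such as '7.2.6001.784' to a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(found, minimum="7.2.6001.784"):
    """Return True if the found version is at least the minimum version."""
    return parse_version(found) >= parse_version(minimum)
```

Comparing tuples of integers avoids the string-comparison trap where "7.10.0.0" would sort before "7.2.6001.784".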

Because of the major changes that have been made with the new Windows Update Agent, this troubleshooting step should be done before any other troubleshooting. In addition to the faster checks, this latest version includes a number of fixes that control CPU utilization. Below are two links explaining the changes:
  • http://blogs.technet.com/wsus/archive/2007/04/28/update-on.aspx
  • http://blogs.technet.com/wsus/archive/2007/05/15/srvhost-msi-issue-follow-up.aspx
The first link actually starts off with the following statement
In addition to the next week’s WSUS 3.0 release, we are making the new client portion available via the following plan to our customers who continue to experience performance issues like UI hang and long scan times.
In one instance, I saw a 90 second scan time go down to 5 seconds. I used the PT (Protocol Tracker) lines of the c:\windows\WindowsUpdate.log file to verify this. Below are screenshots showing the long scan time with Windows Update Agent 2.0, followed by the short scan time after the upgrade to Windows Update Agent 3.0.


This first screen shows the version, start time and end time in bold. You'll notice that updates take 98 seconds to complete.


This second screen also shows the version, start time and end time in bold. You'll notice that updates take 2 seconds to complete.
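
The same measurement can be scripted instead of read off by eye. This is a rough sketch under a stated assumption: each WindowsUpdate.log line is whitespace-separated with the date first, the time second, and the component name (such as PT) in the fifth field. Check that against your own log before relying on it. The function names are hypothetical.

```python
from datetime import datetime


def pt_timestamps(lines):
    """Collect the timestamps of lines whose component field is PT."""
    stamps = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 5 and fields[4] == "PT":
            # Keep only HH:MM:SS; some versions append milliseconds.
            text = fields[0] + " " + fields[1][:8]
            stamps.append(datetime.strptime(text, "%Y-%m-%d %H:%M:%S"))
    return stamps


def scan_seconds(lines):
    """Seconds between the first and last PT line, or None if fewer than two."""
    stamps = pt_timestamps(lines)
    if len(stamps) < 2:
        return None
    return (stamps[-1] - stamps[0]).total_seconds()
```

Feeding it the lines of one detection cycle gives a single number that is easy to compare before and after the agent upgrade.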

Troubleshooting Option 2: Defragment datastore.edb
The c:\windows\SoftwareDistribution\DataStore\DataStore.edb file is a database file that stores the local information about Microsoft Updates. When the Windows Update Agent downloads the WSUS data store, it compares it with the local data store in the DataStore.edb database. I found the instructions for defragmenting the database on a Microsoft forum. I've posted the relevant information below.
The detection scan hits DataStore.edb causing a buffer overflow.
One can run esentutl from a Command Prompt to defragment DataStore.edb
instead of deleting it in hopes that will resolve the issue -

esentutl /d %windir%\SoftwareDistribution\Datastore\datastore.edb

If that doesn't resolve the issue, attempt to Recover the file -

esentutl /r %windir%\SoftwareDistribution\Datastore\datastore.edb

[This command performs recovery, bringing all databases to a
consistent state]

The next to last resort is to attempt to Repair it -

esentutl /p %windir%\SoftwareDistribution\Datastore\datastore.edb

NOTE: MS recommends that if the system is imaged regularly that a new
system image be done after running ANY of the above operations

* On XP Home Edition, one must stop the Automatic Updates service PRIOR
to running the above. This wasn't the case when doing so on XP Pro *

Troubleshooting Option 3: Remove Corrupted Database
This troubleshooting step removes the database directory entirely. The downside of this solution is that you will lose any history of updates. I found this procedure on the following website: http://myitkb.net/category/windows-updates. I've posted the relevant information below
  1. type in net stop wuauserv and then hit Enter
  2. then enter cd /d %windir%\SoftwareDistribution and hit Enter
  3. rd /s DataStore
  4. click Yes at prompt
  5. and then type in net start wuauserv and hit Enter
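
If this has to be done on more than one machine, the steps can be scripted. The sketch below is an illustration only, with hypothetical function names. It adds the `/q` switch to `rd` so the confirmation prompt from step 4 is skipped, and it defaults to a dry run that only returns the commands it would execute.

```python
import subprocess


def datastore_reset_commands(windir=r"C:\WINDOWS"):
    """Build the stop, delete, start sequence from the steps above."""
    store = windir + r"\SoftwareDistribution\DataStore"
    return [
        ["net", "stop", "wuauserv"],
        ["cmd", "/c", "rd", "/s", "/q", store],  # /q skips the Yes/No prompt
        ["net", "start", "wuauserv"],
    ]


def run(commands, dry_run=True):
    """Return the commands as strings; execute them only when dry_run is False."""
    rendered = [" ".join(command) for command in commands]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return rendered
```

Review the output of `run(datastore_reset_commands())` first, then call it with `dry_run=False` from an administrator prompt on the Windows client.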

**Note: On one machine I was testing with, I corrupted something that was required for Windows Update Agent to start. I used the following commands, found on a web forum, to fix the problem:
sc sdset bits "D:(A;;CCLCSWRPWPDTLOCRRC;;;SY)(A;;CCDCLCSWRPWPDTLOCRSDRCWDWO;;;BA)(A;;CCLCSWLOCRRC;;;AU)(A;;CCLCSWRPWPDTLOCRRC;;;PU)"

sc sdset wuauserv "D:(A;;CCLCSWRPWPDTLOCRRC;;;SY)(A;;CCDCLCSWRPWPDTLOCRSDRCWDWO;;;BA)(A;;CCLCSWLOCRRC;;;AU)(A;;CCLCSWRPWPDTLOCRRC;;;PU)"
Use at your own risk

Friday, November 7, 2008

Microsoft WSUS Guide for Cisco NAC deployments

Microsoft Windows Server Update Services (WSUS) provides a method for managing Microsoft updates for company computers. Within a company, there are one or more WSUS servers that get updates from Microsoft. Computers within the company network check in with a WSUS server to get their Microsoft updates. There are a number of benefits to using a WSUS server. Some of the benefits are:

  • Control when updates are installed - This allows companies to test updates before deploying them to the user community.
  • Lower internet bandwidth usage - Keep the bandwidth, used for downloading Microsoft updates, within the internal network. This would keep the internet connection from becoming overloaded by users downloading updates directly from Microsoft

From a security perspective, keeping current with the latest Microsoft updates is very important. Computers are vulnerable to attacks if they do not have the latest security updates installed. Cisco NAC can make sure computers have approved Microsoft updates by using a WSUS Requirement. This requirement uses the WSUS API on the end computer to poll the WSUS server for an index of all approved Microsoft updates. The end computer then uses the local Windows Update Agent to compare the local index, called a data store, with the index received from the WSUS server. Any differences cause the Cisco NAC remediation dialog box to appear and guide the end user through downloading and installing the Microsoft updates.
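
Conceptually, the comparison is a set difference. The sketch below is only a model of that idea, not the WSUS API or the NAC agent's actual logic. The function names are hypothetical, and apart from KB927891 the update identifiers are made up for illustration.

```python
def missing_updates(approved, installed):
    """Return approved updates that are not in the local data store."""
    return sorted(set(approved) - set(installed))


def needs_remediation(approved, installed):
    """True when at least one approved update is missing locally."""
    return bool(missing_updates(approved, installed))


# Hypothetical example data.
approved = ["KB927891", "KB000001", "KB000002"]
installed = ["KB000001", "KB927891"]
```

Here `missing_updates(approved, installed)` leaves one update outstanding, which is the situation in which the remediation dialog would appear.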

In theory this should be a seamless process that occurs quickly. In practice, there are a number of problems that can occur. Common ones include being unable to connect to the WSUS server and errors during the connection. Below are some common tools to use for troubleshooting WSUS problems.

Common Troubleshooting Tools
  1. wuauclt.exe /detectnow - This is a great command to initiate detection of the WSUS server manually. Without this command you need to wait for the Automatic Update process to kick off.
  2. c:\WINDOWS\WindowsUpdate.log - This file provides invaluable logs regarding the status of the Windows update progress.
  3. esentutl.exe - This command is a database utility that can recover and repair the database used, on the end computer. The database is stored in c:\WINDOWS\SoftwareDistribution\DataStore\DataStore.edb
  4. WSUS Client Diagnostics Tool - This tool checks the basic settings required for WSUS to work. The link above provides access to the Microsoft website providing more information about the tool along with a link to download
Here's an example on how the first two tools would be used.

A user is having problems getting Microsoft updates from the WSUS server. You go to the user's computer and check the c:\WINDOWS\windowsupdate.log file. In the file, you notice the following error message:
WARNING: WU client failed Searching for update with error 0x8024400e
You run "wuauclt.exe /detectnow" and check the windowsupdate.log file again to make sure the problem is still occurring. After verifying that it is still occurring, you do a Google search on "error 0x8024400e" and find a link to a website describing a similar problem and offering a solution. You contact the WSUS team and have them implement the change to fix the problem.
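
Pulling the error codes out of the log can be automated. A minimal sketch, assuming the codes always appear as eight hex digits after `0x` on lines containing WARNING, as in the message above. The function name is hypothetical.

```python
import re

ERROR_CODE = re.compile(r"\b0x[0-9a-fA-F]{8}\b")


def warning_error_codes(lines):
    """Return the distinct error codes seen on WARNING lines, in first-seen order."""
    codes = []
    for line in lines:
        if "WARNING" not in line:
            continue
        for code in ERROR_CODE.findall(line):
            code = code.lower()
            if code not in codes:
                codes.append(code)
    return codes
```

The resulting list is a convenient set of search terms for the links below.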

While Google searches are excellent ways of obtaining information about WSUS, I've found a number of links to start your troubleshooting efforts with. Below are the best links I've found to start your research
  1. Main Microsoft WSUS Site including Configuration Guides
  2. Free Microsoft Support
  3. WSUS Wiki Site
  4. How to read the WindowsUpdate.log File
  5. Microsoft Blog about WSUS
  6. WSUS Forum
  7. WindowsUpdate Posts on Eggheadcafe.com
  8. Microsoft WSUS Discussion Group

Sunday, October 19, 2008

Basic Cisco wireless setup with Cisco Supportwiki

Cisco recently introduced wiki pages for their support. I've started using them as one more resource for research and general information. One page that I recently used covered how to set up basic wireless security for a Cisco Access Point.

The instructions were fairly simple, but they got the job done. The instructions were as follows
To configure Wi-Fi Protected Access (WPA) on a Cisco Access Point (AP) without an authentication server, configure the AP with a pre-share key (WPA-PSK).

To configure the WPA-PSK, perform these steps using the GUI interface:

1. In the Encryption Manager window, select cipher TKIP and click Apply.
2. In the Service Set Identifier (SSID) Manager window, perform these steps:
  1. Create an SSID.
  2. Select Open Authentication.
  3. Set the Key Management to Mandatory.
  4. Check the WPA box.
  5. Enter a WPA-PSK and click Apply.
To check out the Cisco Support Wiki Site, go to http://supportwiki.cisco.com

Fun with boolean expressions in Cisco NAC Appliance Rules

This is an example of how to handle a tricky boolean expression with NAC Appliance rules. I'll lay out a scenario and focus on the rule creation. I'll cover the other aspects in future blog posts.

Let's set the stage. Say you're setting up NAC for remote access VPN users using L3 in-band virtual gateway. The VPN is handled by an ASA and authentication is handled by VPN SSO. There are three classes of users, and each class has a different VPN profile for connecting to the VPN. These profiles are provided with separate VPN pools on the ASA. This makes separating the users into roles fairly straightforward, as described below.

With VPN SSO, RADIUS accounting packets are sent from the ASA to the NAC Server (NAS). One RADIUS accounting attribute is called "Framed_IP". This attribute contains the VPN pool IP address of the user. This information is used to map the user into a particular role. In this scenario, the computers used by the remote access users also have registry keys that define which user class they are in.

Now comes the fun part. Most of the users have the correct VPN profile for their user class, but there are some users that have an incorrect VPN profile. We'll call the user classes CLASSA, CLASSB, and CLASSC. How do you block the users, using the incorrect profile, from the network and also provide them with the correct VPN profile?

From a 10,000 foot view, this is an easy task of completing the following steps:
1. Define the unique registry keys for CLASSA, CLASSB, and CLASSC
2. Create checks for each registry key and value name
3. Create the rules based on the checks
4. Create the requirements
5. Tie the requirements to the rules to create requirement-rules
6. Tie the requirement-rules to the roles for the different classes of users (i.e., CLASSA, CLASSB, and CLASSC)

The devil is truly in the details. Steps 1 and 2 are fairly straightforward. Use regedit to find the registry keys and value names. Then create the checks in the NAC Manager (NAM) by navigating to "Device Management > Clean Access > Clean Access Agent > Rules".

Step 3 is where things get interesting. I'll take the rules for CLASSA as an example. The rules will be tied to requirements that ensure that the VPN user is using the correct VPN profile. I'll call the checks created for each class REGA, REGB, and REGC. I'll be creating two rules, plus a catchall rule. Below is a plain-language description of what each rule will accomplish:

1. For CLASSA, if the registry key on the VPN user's computer matches REGA, then pass the rule. If the registry does not match REGA, but does match REGB, then fail the rule. On rule failure, the requirement tied to this rule will provide the user a download link to download the CLASSB VPN profile. If REGA does not exist and REGB does not exist, pass the rule. This will allow other users that do not match REGA and REGB to flow down to the next rule.
2. For CLASSA, if the registry key on the VPN user's computer matches REGA, then pass the rule. If the registry does not match REGA, but does match REGC, then fail the rule. On rule failure, the requirement tied to this rule will provide the user a download link to download the CLASSC VPN profile. If REGA does not exist and REGC does not exist, pass the rule. This will allow other users that do not match REGA and REGC to flow down to the next rule.
3. This rule blocks access if none of the REGA, REGB, or REGC checks were seen. This would be a catchall rule looking for rogue users trying to access the network with an unapproved computer.

I'll explain the boolean logic behind the first rule. The first thing to keep in mind is that the requirement-rule will only trigger on a failure of the rule. If the rule passes, the requirement is deemed successful and no remediation is necessary. With that in mind, let's take a look at the boolean logic for the first rule. The rule will actually be:
REGA or (!REGA and !REGB)
Let's break this out into the individual parts.
Part1: REGA
The first REGA designates that if REGA is found, the rule passes right away. If REGA is not found, evaluation moves on to Part2.
Part2: (!REGA and !REGB)
The portion within the parentheses succeeds only if REGA is not found and REGB is not found. It is important to have the !REGA because we only want to provide remediation if REGB is found and REGA is not found. Remember, what we are really looking for is failure scenarios. The portion within the parentheses fails in three combinations:
  1. REGA is found and REGB not found
  2. REGA is found and REGB is found
  3. REGA is not found and REGB is found.
The first two combinations will never occur because they would already have passed the expression in Part1. The last combination is the one we want to fail. This means that the requirement remediation dialog box will only appear if REGA fails, but REGB is found.
REGA or (!REGA and !REGB)
Putting it all together, if REGA is found, then the rule will succeed and remediation is not necessary. The immediate success is because of the "or" boolean expression right after the REGA. If REGA is not found, then the second part of the expression, within the parentheses, is evaluated. If this expression succeeds, it means that REGB was not found. We're doing this because we only want to provide remediation if REGB is found and REGA is not found.

Below is the matrix form of the rule. 1 indicates that the check evaluates to true. 0 indicates that the check evaluates to false. In the REGA and REGB columns, a 0 indicates that the registry key does not exist and a 1 indicates that the registry key does exist.






REGA  REGB  REGA or (!REGA and !REGB)  Result
0     0     0 or (1 and 1)             1 (pass)
0     1     0 or (1 and 0)             0 (fail)
1     0     1 or (0 and 1)             1 (pass)
1     1     1 or (0 and 0)             1 (pass)
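
If you want to double check a rule expression before entering it in the NAM, a few lines of Python can enumerate every combination. This is just a sketch of the expression above. The function names are mine and have nothing to do with the NAC product itself.

```python
from itertools import product


def rule_passes(rega: bool, regb: bool) -> bool:
    """The rule expression from the post: REGA or (!REGA and !REGB)."""
    return rega or (not rega and not regb)


def truth_table():
    """Return (REGA, REGB, result) rows as 0/1 values for all four combinations."""
    rows = []
    for rega, regb in product([False, True], repeat=2):
        rows.append((int(rega), int(regb), int(rule_passes(rega, regb))))
    return rows


for rega, regb, result in truth_table():
    print(rega, regb, result, "pass" if result else "fail")
```

The only failing row is REGA not found with REGB found, which is the one case where the remediation dialog should appear.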


In summary, the most important thing to keep in mind is that you want the rule to fail in order to trigger the requirement remediation.

Sunday, September 21, 2008

Limiting Operating Systems Allowed Through Cisco NAC Appliance

In order to limit the operating systems that are allowed through the NAC Server, configure the "User Pages". "User Pages" are located at "Administration > User Pages", as seen below. This makes sense when the login page is used for user authentication. Users plug in their computers, try to access a webpage, and are then redirected to the login page for their operating system. If their operating system is not defined under "User Pages", then they are denied access to the network.



What is less obvious is how the "User Pages" affect single sign-on (SSO) scenarios. With NAC Appliance, there are two widely used single sign-on methods. The first SSO method is VPN SSO. This is used mostly with remote VPN access where the VPN device sends the NAC Server a RADIUS accounting packet after successful authentication. This allows the NAC Server to accept sessions from the user as successfully authenticated. The second SSO method is AD SSO. This is used mostly for campus deployments. In this method, a user's AD login is recognized by the NAC Server using Kerberos tickets.

In both SSO methods, the login page is never displayed because authentication is handled by SSO. With this in mind, configuring the User Pages is not an intuitive step in the configuration process. In actuality, the User Pages are very important in the configuration of SSO. The "User Pages" still define which operating systems are allowed through the NAC Server. This means that, even if a user successfully completes SSO, they will not be allowed access, through the NAC Server, if their operating system is not defined in "User Pages". Thinking of it another way, this is still the recommended method of blocking unwanted operating systems even when using SSO.

Thursday, August 28, 2008

Cisco NAC Appliance and Wildcard SSL Certificates

The Cisco NAC Appliance 4.1.6 Server Configuration guide clearly states that wildcard SSL certificates are not supported. Below is the associated text, which is also a link to the section in the guide:
Cisco NAC Appliance does not support "wildcard" certificates.
What is not stated is exactly why this is the case. On the Miami of Ohio mailing list, Nate Austin provided more detailed information about why wildcard certificates are not supported:
Theres actually a valid reason. The client pulls the redirection information out of the certificate Common Name. So if the CN is *.domain.com, it will try to redirect you to that and obviously fail.

I have never personally tried it where the SAN in the cert was the cas name, so I don't know if we can pull the name from there as well, but my instinct says probably not.

Friday, August 15, 2008

Cisco NAC Appliance 4.1.6 upgrade notes

I'll start by saying that it is imperative to read the release notes, cover to cover, before doing the upgrade. There are a couple of problems I ran into with my first two NAC upgrades. Both problems revolved around one big change in 4.1.6. That change requires the communication between the NAS and NAM to provide mutual SSL certificate authentication. This means that the CA root certificate for the NAS SSL certificate needs to exist on the NAM and the CA root certificate for the NAM SSL certificate needs to exist on the NAS. Previously, the NAM only authenticated the NAS SSL certificate, so you only had to make sure that the CA root certificate for the NAS existed on the NAM. With this new requirement, you also now have to make sure that the NAS SSL certificate supports both SSL server and SSL client attributes. Chris Evans does a pretty good job explaining this in his Miami of Ohio Mailing List entry.

The first big problem was that SSL certificates on the NAS and NAM must support SSL client and SSL server attributes. On the Miami of Ohio Mailing List, Rand talked about that issue. I ran into that issue with an Entrust Standard SSL certificate. It turns out that you have to purchase the Entrust Advantage SSL certificate to get the SSL client and SSL server attribute functionality.

Here's what an SSL public certificate with only the SSL server attribute enabled looks like


Here's what an SSL public certificate with SSL server and SSL client attributes enabled looks like. This is what you want to see.


The second problem I ran into had to do with corruption of the SSL certificate when doing the upgrade. I had a Verisign certificate, which uses an intermediate root CA certificate, on the NAS. I made sure I added the root and intermediate CA certificates onto the NAM. When I did the upgrade, the NAS and NAM wouldn't talk. In the NAS and NAM logs there were complaints about an invalid chaining certificate. I checked the Trusted Certificate Authority on the NAS and the NAM and made sure the intermediate and root CA Verisign certificates existed on both. I ended up solving the problem by re-inputting the private key and CA-Signed Certificate on the NAS. Once I did that and rebooted, everything worked fine. I also saw in the 4.1.6 NAS config guide that the cacerts file can get corrupted. That may have been what happened during the upgrade. The config guide recommends the following:
If you check nslookup and date from the CAS, and both the DNS and TIME settings on the CAS are correct, this can indicate that the cacerts file on the CAS is corrupted. In this case, Cisco recommends backing up the existing cacerts file from /usr/java/j2sdk1.4/lib/security/cacerts, overriding it with the file from /perfigo/common/conf/cacerts, then performing “service perfigo restart” on the CAS.

Monday, July 14, 2008

How to remote control a computer connected through a VPN client connection

I've run into numerous cases where a user is successfully connected via a Cisco VPN client and is having application problems. The helpdesk would like to get into the user's computer to diagnose the problem. Since they have a valid VPN tunnel, you'd think they'd just be able to remote desktop into the user's computer and take a look. Unfortunately, as soon as you remote desktop into their computer, you get a screen saying you'll have to kick them off. When you kick them off, you're also killing the VPN connection.

In order to get around that limitation, I know of two options. The first option is to have the user install VNC Server. This would allow the helpdesk to use a VNC client to remote into their computer. This option requires that the user has admin rights on the computer in order to install VNC Server.

The second option is preferable because it does not require admin rights and is already built into Microsoft WinXP. The method involves using "Remote Assistance". The process is to:
  1. Have a user create a file
  2. Email the file to the helpdesk
  3. Have the helpdesk download the file from the email
  4. Have the helpdesk double click on the file to open a connection to the user's computer.

The remote assistance program is located at Start->All Programs->Remote Assistance, as seen below


This brings up the Remote Assistance wizard. The user should click on "Invite someone to help you"

In the next screen click Continue

The next screen gives the option of defining a password that the helpdesk has to input before being allowed remote access

Next the user saves the file

Finally, the user emails the file to the helpdesk. The helpdesk downloads the email attachment and double clicks on the file to launch it. After the file is launched, it opens a remote desktop session to the user's computer.

Sunday, July 13, 2008

Adding static routes for Cisco NAC Manager and Profiler

The NAC Manager and Profiler don't have a documented way of adding a static route in addition to the default gateway. In most cases this is fine because all traffic follows the default gateway. When doing a pilot or setting up a lab environment, there's a greater chance of needing a static route to send some traffic to a next hop other than the default gateway.

Since the NAC products are built on Fedora Core, you can use the standard way of adding static routes within Fedora. Modifying the routing table requires root access, so make sure you are logged in as root or type "su -" to elevate to root privileges. Assuming eth0 is used for the traffic, you would create a file called "route-eth0" in the "/etc/sysconfig/network-scripts" directory. Here's an example of the contents of the file, assuming you want to route the 192.168.0.0/16 subnet to 10.1.1.110:

GATEWAY0=10.1.1.110
NETMASK0=255.255.0.0
ADDRESS0=192.168.0.0

As you probably figured out, you can add additional entries for GATEWAY1, NETMASK1, and ADDRESS1 to add additional static routes.
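
If you have more than a couple of routes, it's easy to fat-finger a netmask. Here's a small Python sketch that builds the file contents from CIDR notation. It assumes the GATEWAYn/NETMASKn/ADDRESSn layout shown above. The function name and the second route are made up for illustration.

```python
import ipaddress


def route_file_lines(routes):
    """Build route-eth0 style lines from (destination CIDR, gateway) pairs."""
    lines = []
    for index, (cidr, gateway) in enumerate(routes):
        network = ipaddress.ip_network(cidr)  # raises ValueError on a bad subnet
        ipaddress.ip_address(gateway)         # raises ValueError on a bad gateway
        lines.append(f"GATEWAY{index}={gateway}")
        lines.append(f"NETMASK{index}={network.netmask}")
        lines.append(f"ADDRESS{index}={network.network_address}")
    return lines


routes = [
    ("192.168.0.0/16", "10.1.1.110"),  # the route from the post
    ("172.16.0.0/12", "10.1.1.111"),   # hypothetical second route
]
print("\n".join(route_file_lines(routes)))
```

The printed lines can be pasted into /etc/sysconfig/network-scripts/route-eth0. The script only generates text and does not touch the routing table.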

Once you've created this file, you can apply it in one of two ways. The safest way is to reboot the device with "shutdown -r now". The second way is to just restart the network service with "service network restart".

Tuesday, July 8, 2008

Ports required for AD SSO

When configuring NAC for AD SSO, the last place you'd think to look would be the documentation, right? Well, this would be one time that it makes a lot of sense to RTFM. On page 9-7 of the PDF version of the 4.1.3 Clean Access Server Installation and Configuration Guide they have all the ports required for AD SSO.

Here are the TCP ports required, in the unauthenticated role, for AD SSO to work: 88, 135, 389, 445, 1025, and 1026.

The one thing that isn't listed in the documentation is that ICMP is also required. Part of the login process includes trying to ping the AD server. If this fails, then AD login doesn't work.
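
A quick way to sanity check the TCP side is to try a connection to each port from a test machine sitting in the unauthenticated role. Below is a rough Python sketch of that check. The port list is the one above, the function names are mine, and it does not cover the ICMP requirement, so you'd still want to ping the AD server separately.

```python
import socket

AD_SSO_TCP_PORTS = [88, 135, 389, 445, 1025, 1026]  # list from the post


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports=AD_SSO_TCP_PORTS, probe=tcp_port_open):
    """Map each port to True/False using the given probe function."""
    return {port: probe(host, port) for port in ports}
```

Calling check_ports("dc01.example.com") returns a dictionary of port to True/False, where dc01.example.com is a placeholder for your own AD server. A False only tells you the connection failed, not whether the NAC Server or something else in the path blocked it.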

Monday, July 7, 2008

CSA Basic Building Blocks

CSA is a very powerful tool to enforce the security policy for a company. It has a very structured approach to create a security policy that is enforced through the CSA Agents. In order to optimize its use, it is important to understand the fundamental building blocks involved with turning the written security policy into an actionable enforcement tool.

I view the building blocks in two separate parts. The first part is creating the actions that will be used to enforce the security policy. The second part is defining the different types of computers, such as desktops and servers, that have the same type of characteristics. Once these two parts are created, they are linked together so that the correct actions are linked to the appropriate types of computers.

The first part involves creating three objects that build upon each other: rules, rule modules, and policies. The first object is called a rule. This is the basic if/then action that determines enforcement. An example would be, "if an application tries to open a cmd.exe shell, deny and log the access". In addition to denying access, there are a number of different actions that can be taken. The diagram below shows the different actions available. The diagram is important, because, in many places within CSA, the icons associated with the actions are shown without the actual names.




The second object is called a rule module. A rule module combines multiple rules together that all pertain to the same operating system and provide the same type of functionality. Rule modules are then combined into a policy. The policy should contain all aspects that cover the security policy for a particular group of computers (i.e., desktops or servers). Unlike the rule modules, the policies are not restricted to pertaining to a single operating system. That completes the first part.

The second part is defining the types of computers. CSA calls these groups. These groups break up the computers based on operating system and other logical criteria such as function and business group. Additionally, CSA parameters, such as polling interval, alerts, and events, can be defined for the group instead of for individual hosts.

The last step is to tie the policies created to the groups. This creates an enforceable security policy for the different types of computers in the network.
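
To make the relationships concrete, here's a toy Python model of the building blocks. This is purely my own illustration and not how CSA stores anything internally. The class names, fields, and the idea that a group only picks up rule modules matching its operating system are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    description: str  # the "if" part, in plain words
    action: str       # the "then" part, e.g. "deny"
    log: bool = True


@dataclass
class RuleModule:
    name: str
    operating_system: str  # a rule module pertains to one operating system
    rules: list = field(default_factory=list)


@dataclass
class Policy:
    name: str
    modules: list = field(default_factory=list)  # may mix operating systems


@dataclass
class Group:
    name: str
    operating_system: str
    policies: list = field(default_factory=list)

    def effective_rules(self):
        """Assumption: only modules matching the group's OS apply to it."""
        return [
            rule
            for policy in self.policies
            for module in policy.modules
            if module.operating_system == self.operating_system
            for rule in module.rules
        ]


no_cmd = Rule("application tries to open a cmd.exe shell", action="deny")
win_module = RuleModule("Windows shell protection", "Windows", [no_cmd])
linux_module = RuleModule("Linux shell protection", "Linux",
                          [Rule("hypothetical Linux rule", action="deny")])
desktop_policy = Policy("Desktops", [win_module, linux_module])
desktops = Group("Windows desktops", "Windows", [desktop_policy])
print([rule.description for rule in desktops.effective_rules()])
```

The nesting mirrors the text: rules go into rule modules, rule modules go into policies, and policies get attached to groups.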

Wednesday, June 25, 2008

AVG 8.0 will be fully supported in NAC 4.1.6

Currently NAC only supports installation checks for the paid version of AVG 8.0. The free version and definition file checks will be supported in version 4.1.6. From what I've been told, this version should be coming out sometime in July.

Tuesday, June 17, 2008

Resetting NAC Manager database

I've been writing some NAC labs and I wanted to figure out the best way to clear out the database and start from scratch. I found the instructions in the /perfigo/dbscripts/README file on the NAC Manager. Here are the relevant commands to clear out the database and start from scratch:

To remove perfigo database issue:
-----------------------------
su -l postgres -c "psql -h 127.0.0.1 -p 5432 controlsmartdb < /perfigo/dbscripts/pg_droptable.sql"
su -l postgres -c "dropdb -h 127.0.0.1 -p 5432 controlsmartdb"

To install perfigo database issue:
-----------------------------
su -l postgres -c "createdb -h 127.0.0.1 -p 5432 controlsmartdb"
su -l postgres -c "psql -h 127.0.0.1 -p 5432 controlsmartdb < /perfigo/dbscripts/pg_createtable.sql"
*Note: Running the commands will remove the license file as well, so make sure you have the NAC Manager and Server license files before running the commands.

Saturday, June 14, 2008

Solution to slow CAM login

I just saw this in the 4.1(1) release notes. It's resolved caveat CSCsi23228. I haven't had to use it, but it may be useful someday if I run into slow CAM login times.
http://www.cisco.com/en/US/docs/security/nac/appliance/release_notes/411/411rn.html

CAM database performance degraded over time

Clean Access Manager performance degrades over time, users may experience slowness during login process and CAM web administration interfaces. The slowness may start to exhibit itself after an extensive number of database delete/insert/modify operations.

There are three workarounds for this issue which can be applied under different conditions.

Workaround 1

This can be applied during maintenance window when CAM is not in service. Note that this may take up several minutes, please do not interrupt the process.

1. service perfigo stop
2. su -l postgres
3. vacuumdb -h 127.0.0.1 -a -f
4. exit
5. service postgresql restart
6. service perfigo start

Workaround 2

This can be applied when system is in service with light load. Note that this may take up several minutes, please do not interrupt the process.
1. su -l postgres
2. vacuumdb -h 127.0.0.1 -a -f
3. exit

Workaround 3: This can be added as system daily cron job to prevent the potential slowness.

1. Create a file named "db_vacuum.sh" under "/etc/cron.daily" with the following content:
#!/bin/sh
su - postgres -c "vacuumdb -h 127.0.0.1 -a -f"
2. cd /etc/cron.daily
3. chmod +x db_vacuum.sh

Friday, June 13, 2008

DMVPN with NAT

It looks like Cisco has been fixing NAT issues with DMVPN. They fixed the NAT issue for spokes talking to the hub using NAT traversal. This is the same method that VPN clients use. It uses UDP port 4500 to send the IPSec traffic instead of IP protocol 50 (ESP) and IP protocol 51 (AH). Here's a link with more explanation.
http://www.cisco.com/en/US/docs/ios/security/configuration/guide/dmvpn_dt_spokes_b_nat.html

In versions after 12.4(6)T, the spoke-to-spoke traffic with NAT is supported. Take a look at this link for more information.
http://www.cisco.com/en/US/docs/ios/12_2t/12_2t13/feature/guide/ftgreips.html#wp1039515

Here's the important information from the link:
In Cisco IOS Release 12.4(6)T or earlier, DMVPN spokes behind NAT will not participate in dynamic direct spoke-to-spoke tunnels. Any traffic to or from a spoke that is behind NAT will be forwarded using the DMVPN hub routers. DMVPN spokes that are not behind NAT in the same DMVPN network may create dynamic direct spoke-to-spoke tunnels between each other.

In Cisco IOS Release 12.4(6)T or later releases, DMVPN spokes behind NAT will participate in dynamic direct spoke-to-spoke tunnels. The spokes must be behind NAT boxes that are preforming NAT, not PAT. The NAT box must translate the spoke to the same outside NAT IP address for the spoke-spoke connections as the NAT box does for the spoke-hub connection. If there is more than one DMVPN spoke behind the same NAT box, then the NAT box must translate the DMVPN spokes to different outside NAT IP addresses. It is also likely that you may not be able to build a direct spoke-spoke tunnel between these spokes. If a spoke-spoke tunnel fails to form, then the spoke-spoke packets will continue to be forwarded via the spoke-hub-spoke path.

I tried this out in a Dynamips lab and it worked great.

Here's a diagram of the dynagen lab I created with the relevant config

Wednesday, June 11, 2008

How does Cisco NAC change your DHCP IP

When implementing NAC you may wonder how it changes your IP when you move back and forth between the untrusted and trusted VLANs. Back in the olden days, the only way to do this was to bounce the switch port. This caused the link to go down on the connected computer which kicked off a new DHCP request. Nowadays there's a method that works better when the switch port has an IP phone and a computer on the same