Christian Beedgen

As co-founder and CTO of Sumo Logic, Christian Beedgen brings 18 years of experience creating industry-leading enterprise software products. Since 2010 he has been focused on building Sumo Logic’s multi-tenant, cloud-native machine data analytics platform, which is widely used today by more than 1,600 customers and 50,000 users. Prior to Sumo Logic, Christian was an early engineer, engineering director, and chief architect at ArcSight, contributing to ArcSight’s SIEM and log management solutions.

Posts by Christian Beedgen

Sumo Logic Recognized as Data Analytics Solution of the Year Showcasing the Power of Continuous Intelligence

Love In The Time Of Coronavirus

How We Understand Monitoring

All The Logs For All The Intelligence

Service Levels––I Want To Buy A Vowel

See You in September at Illuminate!

The Super Bowl of the Cloud

Platforms All The Way Up & Down

Microservices for Startups Explained

Machine Data for the Masses

Update On Logging With Docker

A Simpler & Better Way

In New Docker Logging Drivers, I previously described how to use the new Syslog logging driver introduced in Docker 1.6 to transport container logs to Sumo Logic.

Since then, there have been improvements to the Syslog logging driver, which now allows users to specify the address of the Syslog server to send the logs to. In its initial release, the Syslog logging driver simply logged to the local Syslog daemon, but this is now configurable. We can exploit this in conjunction with the Sumo Logic Collector container for Syslog to make logging with Docker and Sumo Logic even easier.

Simply run the Syslog Collector container as previously described:

```
$ docker run -d -p 514:514 -p 514:514/udp \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-syslog \
    [Access ID] [Access key]
```

You now have a Collector running, listening for Syslog on both ports 514/tcp and 514/udp.

For every container required to run on the same host, you can now add the following to the docker run command in order to make the container log to your Syslog Collector:

```
--log-driver syslog --log-opt syslog-address=udp://localhost:514
```

Or, in a complete example:

```
$ docker run --rm --name test \
    --log-driver syslog --log-opt syslog-address=udp://localhost:514 \
    ubuntu \
    bash -c 'for i in `seq 1 10`; do echo Hello $i; sleep 1; done'
```

You should now see something along these lines in Sumo Logic.

This, of course, works remotely as well. You can run the Sumo Logic Collector on one host, and have containers on all other hosts log to it by setting the syslog address accordingly when running those containers.

And Here Is An Erratum

In New Docker Logging Drivers, I described the newly added logging drivers in Docker 1.6. At the time, Docker was only able to log to the local syslog, and hence our recommendation for integration was as follows:

```
$ docker run -v /var/log/syslog:/syslog -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-logging-driver-syslog \
    [Access ID] [Access Key]
```

This basically has the Sumo Logic Collector tail the OS /var/log/syslog file. We have since discovered that this causes issues when /var/log/syslog is rotated by logrotate: the container hangs on to the original file into which Syslog initially wrote the messages and does not pick up the new file after the old one has been moved out of the way.

There is a simple solution: mount the directory into the container, not the file. In other words, please do this:

```
$ docker pull sumologic/collector:latest-logging-driver-syslog
$ docker run -v /var/log:/syslog -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-logging-driver-syslog \
    [Access ID] [Access Key]
```

Or, of course, switch to the new and improved approach described above!
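For the remote case, the only change is the syslog address. Here is a minimal sketch, assuming the Collector container runs on a host reachable as collector.example.com (a placeholder hostname) with port 514/udp open:

```
$ docker run --rm --name test-remote \
    --log-driver syslog --log-opt syslog-address=udp://collector.example.com:514 \
    ubuntu echo "Hello from another host"
```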

Comprehensive Monitoring For Docker - More Than "Just" Logs

Today I am happy to be able to talk about something that has been spooking around in my head for the last six months or so. I've been thinking about this ever since we started looking into Docker and how it applies to what we are doing here at Sumo. There are many different and totally valid ways to get logs and statistics out of Docker. Options are great, but I have concluded that the ultimate goal should be a solution that doesn't require users to have in-depth knowledge of everything that is available for monitoring and the various methods to get to it. Instead, I want something that just pulls all the monitoring data out of the containers and Docker daemons with minimal user effort. In my head, I have been calling this "a comprehensive solution".

Let me introduce you to the components that I think need to be part of a comprehensive monitoring solution for Docker:

- Docker events, to track container lifecycles
- Configuration info on containers
- Logs, naturally
- Statistics on the host and the containers
- Other host stuff (daemon logs, host logs, ...)

Events

Let's start with events. The Docker API makes it trivial to subscribe to the event stream. Events contain lots of interesting information. The full list is well described in the Docker API docs, but suffice it to say you can watch containers come and go, as well as observe containers getting killed and other interesting situations, such as out-of-memory conditions. Docker has consistently added new events with every version, so this is a gift that will keep on giving.

I think of Docker events as nothing but logs. And they are very nicely structured—it's all just JSON. If, for example, I can load this into my log aggregation solution, I can now track which container is running where. I can also track trends: which images are run in the first place, and how often are they being run? Why are suddenly 10x more containers started in this period versus before? This probably doesn't matter much for personal development, but once you have fleets, this is a super juicy source of insight. Lifecycle tracking for all your containers will matter a lot.

Configurations

Docker events, among other things, allow us to see containers come and go. What if we also wanted to track the configurations of those containers? Maybe we want to track drift of run parameters, such as volume settings, or capabilities and limits. The container image is immutable, but what about the invocation? Having detailed records of container starting configurations is, in my mind, another piece of the puzzle on the way to total visibility. Orchestration solutions will provide those settings, sure, but who is telling those solutions what to do? From our own experience, we know that deployment configurations inevitably drift, and we have found the root cause of otherwise inscrutable problems there more than once. Docker allows us to use the inspect API to get the container configuration. Again, in my mental model, that's just a log. Send it to your aggregator. Alert on deviations, and use the data after the fact for troubleshooting. Docker provides this info in a clean and convenient format.
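Both of these sources are a single command away with the stock Docker CLI. A minimal sketch (the container name my-app is just a placeholder):

```
# Stream lifecycle events (create, start, die, oom, ...) as they happen
$ docker events

# Dump a running container's full configuration as JSON, ready to ship to an aggregator
$ docker inspect my-app

# Or pull out an individual setting, e.g. the volume bindings
$ docker inspect --format '{{json .HostConfig.Binds}}' my-app
```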
Logs

Well, obviously, it would be great to have logs, right? It turns out there are many different ways to deal with logs in Docker, and new options are being enabled by the new log driver API. Not everybody is quite there yet in 12-factor land, but then again there are workarounds for when you have fat containers and need to collect logs from files inside of containers. More and more, I see people following the best practice of writing logs to standard out and standard error, and it is pretty straightforward to grab those logs from the logs API and forward them from there. The Logspout approach, for example, is really neat: it uses the event API to watch which containers get started, then turns around, attaches to the log endpoint, and pumps the logs somewhere. Easy and complete, and you have all the logs in one place for troubleshooting, analytics, and alerting.

Stats

Since the release of Docker 1.5, container-level statistics are exposed via a new API. Now you can alert on the "throttled_data" information, for example - how about that? Again (and at this point, this is getting repetitive, perhaps), this data should be sucked into a centralized system. Ideally, this is the same system that already has the events, the configurations, and the logs, so that logs can be correlated with the metrics and events. This is how I think we get to a comprehensive solution. There are many pieces to the puzzle, but all of this data can be extracted from Docker pretty easily today. I am sure that as we all keep learning more about this, it will get even easier and more efficient.
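As a sketch of what pulling stats looks like in practice, both the CLI and the remote API expose the same data (my-app is a placeholder container name, and curl needs to be recent enough to support --unix-socket):

```
# Live CPU, memory, and network numbers for a container
$ docker stats my-app

# The same data as a raw JSON stream, straight from the API
$ curl --unix-socket /var/run/docker.sock http://localhost/containers/my-app/stats
```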
Host Stuff

In all the excitement around APIs for monitoring data, let's not forget that we also need host-level visibility. A comprehensive solution should therefore also work hard to get the Docker daemon logs, and provide a way to get any other system-level logs that factor into the way Docker is being put to use on the hosts of the fleet. Add host-level statistics to this, and performance issues can be understood in a holistic fashion - on a container basis, but also in relation to how the host is doing. Maybe there's some intricate interplay between containers based on placement that pops up on one host but not another? Without quick access to the actual data, you will scratch your head all day.

User Experience

What's the desirable user experience for a comprehensive monitoring solution for Docker? I think it needs to be brain-dead easy. Thanks to the API-based approach that allows us to get to all the data either locally or remotely, it should be easy to encapsulate all the monitoring data acquisition and forwarding into a container that can either run remotely, if the Docker daemons support remote access, or as a system container on every host. Depending on how the emerging orchestration solutions approach this, it might not even be too crazy to assume that the collection container could simply attach to a master daemon. It seems Docker Swarm might make this possible. Super simple: just add the URL to the collector config and go.

I really like the idea of being able to do all of this through the API, because now I don't need to introduce other requirements on the hosts. Do they have Syslog? JournalD? Those are of course all great tools, but as the levels of abstraction keep rising, we will be less and less able to make assumptions about the hosts. API-based access provides decoupling and allows for composition.

All For One

So, to be completely honest, there's a little bit more going on here on our end than just thinking about this solution. We have started to implement almost all of these ideas into a native Sumo Logic collection Source for Docker. We are not ready to make it generally available just yet, but we will be showing it off next week at DockerCon (along with another really cool thing I am not going to talk about here). Email [email protected] to get access to a beta version of the Sumo Logic collection Source for Docker.

The Power of 5

Five years, five rounds of financing, five hundred customers already, and 500 Sumo employees down the road. And there’s another 5 hidden in this story which you will have to puzzle out yourself. We welcome our new investors Draper Fisher Jurvetson Growth and Institutional Venture Partners, as well as Glynn Capital and Tenaya Capital. And we say thank you for the continued support of the people and the firms that have added so much value while fueling our journey: Greylock, Sutter Hill Ventures, Accel Partners, and Sequoia.

It is fair to say that we were confident in the beginning that the hypotheses on which Sumo Logic was founded are fundamentally solid. But living through the last 5 years, and seeing what the people in this company have accomplished to build on top of this foundation, is truly breathtaking and fills me with great pride. For us, the last five years have been a time of continuous scaling. And yet we managed to stay true to our vision – to make machine data useful with the best service we can possibly provide. We have become experts at using the power of the scalability that’s on tap in our backend to relentlessly crunch through data. Our customers are telling us that this again and again surfaces the right insights that help them understand their application and security infrastructures. And with our unique machine learning capabilities, we can turn outliers and anomalies into those little “tap tap tap”-on-your-shoulder moments that make the unknown known and that truly turn data into gold.

One of the (many) things that is striking to me when looking back over the last 5 years is just how much I appreciate the difference between building software and building a service. They will have to drag me back kicking and screaming to build a product as a bunch of code to be shipped to customers. That’s right, I am a recovering enterprise software developer. We had a hunch that there must be a better way, and boy were we right. Choosing to build Sumo Logic as a service was a very fundamental decision – we never wanted to ever again be in a situation in which we were unable to observe how our product was being used. As a service, we have the ultimate visibility, and seeing and synthesizing what our customers are doing continuously helps to support our growth. At the same time, we have nearly total control over the execution environment in which our product operates. This is enlightening for us as engineers because it removes the guesswork when bug reports come in. No longer do I have to silently suspect that maybe it is related to that old version of Solaris that the customer insists on running my product on. And no, I don’t want to educate you anymore about which RAID level you need for the database supporting my software, because if you don’t believe me, we are both going to be in a world of hurt 6 months down the road when everything grinds to a halt. I simply don’t want to talk anymore about you having to learn to run and administer my product. Our commitment and value is simple: let me do it for you, so you can focus on using our service and getting more value. Give us the control to run it right and all will benefit.

Obviously, we are not alone in having realized the many benefits of software as a service – SaaS. This is why the trend to convert all software to services has only started. Software is eating software, quite literally. I see it every day when we replace legacy systems. We ourselves exclusively consume services at Sumo Logic – we have no data center. We literally have just one Linksys router sitting alone and lonely in the corner of our office, tying the wireless access points to some fiber coming out of the floor. That’s it. Everything else is a service. We believe this is a better way to live, and we put our money where our mouth is, supporting our fellow product companies that have gone the service route. So in many ways we are all riding the same wave, the big mega trend – a trend that is based on efficiency and on the possibility of doing things in a truly better way. And we have the opportunity to both be like and behave like our customers, while actually helping our customers build these great new forward-looking systems.

At Sumo Logic, we have created a purpose-built cloud analytics service that supports, and is needed by, every software development shop over the next number of years as more and more products are built on the new extreme architecture. Those who have adopted and are adopting the new way of doing things are on board already, and we are looking forward to supporting the next waves by continuing to provide the best service to monitor, troubleshoot, and proactively maintain the quality of your applications, infrastructure, and ultimately your service. In addition, with our unique and patented machine learning analytics capabilities, we can further deliver on our vision to bring machine intelligence to the masses, whereas this was previously only available to the fortunate few.

As we scale along with the great opportunity that the massive wave of change in IT and software is bringing, we will put the money provided by our investors to the best possible use we can think of. First of all, we will continue to bring more engineers and product development talent on board. The addition of this new tech talent will help us further develop our massive elastic-scale platform, which has grown more than 1000X in the past few years in terms of data ingested. In fact, we are already processing 50TB of new data every day, and that number will only go up. Our own production footprint has reached a point where we would literally have to invent a product like Sumo Logic in order to keep up – thankfully, we enjoy eating our own dog food, all across the company. Except for the dogs in the office; they’d actually much rather have more human food. In any case, this service is engineering heavy, full of challenges along many dimensions, and scale is just one of them. If you are looking for a hardcore technical challenge, let’s talk ([email protected]). And while we continue to tweak our system and adhere to our SLAs (even for queries!), we will also massively grow the sales, G&A, marketing, and customer success side of the company to bring what we believe to be the best purpose-built cloud service for monitoring modern application architectures to more and more people, and to constantly improve on our mission of maniacal customer success.

What do you say? Five more years, everybody!!!

June 1, 2015

Collecting In-Container Log Files

Docker and the use of containers is spreading like wildfire. In a Docker-ized environment, certain legacy practices and approaches are being challenged, and centralized logging is one of them. The most popular way of capturing logs coming from a container is to set up the containerized process such that it logs to stdout. Docker then spools this to disk, from where it can be collected. This is great for many use cases. We have of course blogged about this multiple times already. If the topic fascinates you, also check out a presentation I did in December at the Docker NYC meetup.

At the same time, our customers are telling us that the stdout approach doesn’t always work. Not all containers are set up to follow the process-per-container model; this is sometimes referred to as “fat” containers. There are tons of opinions about whether this is the right thing to do or not. Pragmatically speaking, it is a reality for some users. Even some programs that are otherwise easily containerized as single processes pose challenges to the stdout model. For example, popular web servers write at least two log files: access and error logs. There are of course workarounds to map this back to a single stdout stream, but ultimately there’s only so much multiplexing that can be done before the demuxing operation becomes too painful.

A Powerstrip for Logfiles

Powerstrip-Logfiles presents a proof of concept for easily centralizing log files from within a container. Simply setting LOGS=/var/log/nginx in the container environment, for example, will use a bind mount to make the Nginx access and error logs available on the host under /var/log/container-logfiles/containers/[ID of the Nginx container]/var/log/nginx. A file-based log collector can now simply be configured to recursively collect from /var/log/container-logfiles/containers and will pick up logs from any container configured with the LOGS environment variable.

Powerstrip-Logfiles is based on the Powerstrip project by ClusterHQ, which is meant to provide a way to prototype extensions to Docker. Powerstrip is essentially a proxy for the Docker API. Prototypical extensions can hook Docker API calls and do whatever work they need to perform. The idea is to allow extensions to Docker to be composable – for example, to add support for overlay networks such as Weave and for storage managers such as Flocker.

Steps to run Powerstrip-Logfiles

Given that the Powerstrip infrastructure is meant to support prototyping of what will hopefully one day become Docker extensions, there are still a couple of steps required to get this to work.
First of all, you need to start a container that contains the powerstrip-logfiles logic:

```
$ docker run --privileged -it --rm \
    --name powerstrip-logfiles \
    --expose 80 \
    -v /var/log/container-logfiles:/var/log/container-logfiles \
    -v /var/run/docker.sock:/var/run/docker.sock \
    raychaser/powerstrip-logfiles:latest \
    -v --root /var/log/container-logfiles
```

Next, you need to create a Powerstrip configuration file…

```
$ mkdir -p ~/powerstrip-demo
$ cat > ~/powerstrip-demo/adapters.yml <<EOF
endpoints:
  "POST /*/containers/create":
    pre: [logfiles]
    post: [logfiles]
adapters:
  logfiles: http://logfiles/v1/extension
EOF
```

…and then you can start the powerstrip container that acts as the Docker API proxy:

```
$ docker run -d --name powerstrip \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/powerstrip-demo/adapters.yml:/etc/powerstrip/adapters.yml \
    --link powerstrip-logfiles:logfiles \
    -p 2375:2375 \
    clusterhq/powerstrip
```

Now you can use the normal docker client to run containers. First you must export the DOCKER_HOST variable to point at the powerstrip server:

```
$ export DOCKER_HOST=tcp://127.0.0.1:2375
```

Now you can specify, as part of the container’s environment, which paths are supposed to be considered logfile paths. Those paths will be bind-mounted to appear under the location of the --root specified when running the powerstrip-logfiles container.

```
$ docker run --cidfile=cid.txt --rm -e "LOGS=/x,/y" ubuntu \
    bash -c 'touch /x/foo; ls -la /x; touch /y/bar; ls -la /y'
```

You should now be able to see the files “foo” and “bar” under the path specified as the --root:

```
$ CID=$(cat cid.txt)
$ ls /var/log/container-logfiles/containers/$CID/x
$ ls /var/log/container-logfiles/containers/$CID/y
```

See the example in the next section for how to most easily hook up a Sumo Logic Collector.

Sending Access And Error Logs From An Nginx Container To Sumo Logic

For this example, you can just run Nginx from a toy image off of Docker Hub:

```
$ CID=$(DOCKER_HOST=localhost:2375 docker run -d --name nginx-example-powerstrip \
    -p 80:80 -e LOGS=/var/log/nginx \
    raychaser/powerstrip-logfiles:latest-nginx-example) && echo $CID
```

You should now be able to see the Nginx container’s /var under the host’s /var/log/container-logfiles/containers/$CID/:

```
$ ls -la /var/log/container-logfiles/containers/$CID/
```

And if you tail the access log from that location while hitting http://localhost, you should see the hits being logged:

```
$ tail -F /var/log/container-logfiles/containers/$CID/var/log/nginx/access.log
```

Now all that’s left is to hook up a Sumo Logic Collector to the /var/log/container-logfiles/containers/ directory, and all the logs will come to your Sumo Logic account:

```
$ docker run -v /var/log/container-logfiles:/var/log/container-logfiles -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-powerstrip \
    [Access ID] [Access Key]
```

This Collector image is pre-configured to collect all files under /var/log/container-logfiles/containers, which by way of the -v volume mapping in the invocation above is where powerstrip-logfiles writes the logs for in-container files by default. As a Sumo Logic user, it is very easy to generate the required access key by going to the Preferences page. Once the collector is running, you can search for _sourceCategory=collector-container in the Sumo Logic UI and you should see the toy Nginx logs.

Simplify using Docker Compose

And just because we can, here’s how this could all work with Docker Compose.
Docker Compose allows us to write a single spec file that contains all the details on how the Powerstrip container, powerstrip-logfiles, and the Sumo Logic collector container are to be run. The spec is a simple YAML file:

```
powerstriplogfiles:
  image: raychaser/powerstrip-logfiles:latest
  ports:
    - 80
  volumes:
    - /var/log/container-logfiles:/var/log/container-logfiles
    - /var/run/docker.sock:/var/run/docker.sock
  environment:
    ROOT: /var/log/container-logfiles
    VERBOSE: true
  entrypoint:
    - node
    - index.js

powerstrip:
  image: clusterhq/powerstrip:latest
  ports:
    - "2375:2375"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - ~/powerstrip-demo/adapters.yml:/etc/powerstrip/adapters.yml
  links:
    - "powerstriplogfiles:logfiles"

sumologiccollector:
  image: sumologic/collector:latest-powerstrip
  volumes:
    - "/var/log/container-logfiles:/var/log/container-logfiles"
  env_file: .env
```

You can copy and paste this into a file called docker-compose.yml, or take it from the powerstrip-logfiles Github repo. Since the Sumo Logic Collector will require valid credentials to log into the service, we need to put those somewhere Docker Compose can wire them into the container. This can be accomplished by putting them into a file called .env in the same directory, something like so:

```
SUMO_ACCESS_ID=[Access ID]
SUMO_ACCESS_KEY=[Access Key]
```

This is not a great way to deal with credentials. Powerstrip in general is not production ready, so please keep in mind to try this only outside of a production setup, and make sure to delete the access ID and access key in the Sumo Logic UI afterwards. Then simply run, in the same directory as docker-compose.yml, the following:

```
$ docker-compose up
```

This will start all three required containers and start streaming logs to Sumo Logic. Have fun!
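As a quick smoke test once docker-compose up is running, you can reuse the pattern from earlier in this post: point the Docker client at the Powerstrip proxy, write into a declared log path, and check that the file shows up on the host. This is just a sketch using the paths and ports from the examples above; the file name and message are arbitrary:

```
$ export DOCKER_HOST=tcp://127.0.0.1:2375
$ docker run --rm -e "LOGS=/x" ubuntu bash -c 'echo smoke-test > /x/test.log'
$ ls /var/log/container-logfiles/containers/
```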

New Docker Logging Drivers

Docker release 1.6 introduces the notion of a logging driver. This is a very cool capability and a huge step forward in creating a comprehensive approach to logging in Docker environments.

It is now possible to route container output (stdout and stderr) to syslog. It is also possible to completely suppress the writing of container output to file, which can help in situations where disk space usage is of importance. This post will also show how easy it is to integrate the syslog logging driver with Sumo Logic.

Let’s review for a second. Docker has been supporting logging of a container’s standard output and standard error streams to file for a while. You can see how this works in this quick example:

```
$ CID=$(docker run -d ubuntu echo "Hello")
$ echo $CID
5594248e11b7d4d40cfec4737c7e4b7577fe1e665cf033439522fbf4f9c4e2d5
$ sudo cat /var/lib/docker/containers/$CID/$CID-json.log
{"log":"Hello\n","stream":"stdout","time":"2015-03-30T00:34:58.782658342Z"}
```

What happened here? Our container simply outputs Hello. This output goes to the standard output of the container. By default, Docker writes the output, wrapped in JSON, to a file named after the container ID, in a directory under /var/lib/docker/containers that is also named after the container ID.

Logging the Container Output to Syslog

With the new logging drivers capability, it is possible to select the logging behavior when running a container. In addition to the default json-file driver, there is now also a syslog driver. To see this in action, do this in one terminal window:

```
$ tail -F /var/log/syslog
```

Then, in another terminal window, do this:

```
$ docker run -d --log-driver=syslog ubuntu echo "Hello"
```

When running the container, you should see something along these lines in the tailed syslog file:

```
Mar 29 17:39:01 dev1 docker[116314]: 0e5b67244c00: Hello
```

Cool! Based on the --log-driver flag, which is set to syslog here, syslog received a message from the Docker daemon, which includes the container ID (well, the first 12 characters anyway), plus the actual output of the container. In this case, of course, the output was just a simple message. To generate more messages, something like this will do the trick:

```
$ docker run -t -d --log-driver=syslog ubuntu \
    /bin/bash -c 'while true; do echo "Hello $(date)"; sleep 1; done'
```

While still tailing the syslog file, a new log message should appear every second.

Completely Suppressing the Container Output

Notably, when the logging driver is set to syslog, Docker sends the container output only to syslog, and not to file. This helps in managing disk space. Docker’s default behavior of writing container output to file can cause pain in managing disk space on the host. If a lot of containers are running on the host, and logging to standard out and standard error is used (as recommended for containerized apps), then some sort of space management for those files has to be bolted on, or the host eventually runs out of disk space. This is obviously not great.
But now there is also a none option for the logging driver, which will essentially dev-null the container output:

```
$ CID=$(docker run -d --log-driver=none ubuntu \
    /bin/bash -c 'while true; do echo "Hello"; sleep 1; done')
$ sudo cat /var/lib/docker/containers/$CID/$CID-json.log
cat: /var/lib/docker/containers/52c646fc0d284c6bbcad48d7b81132cb7ba03c04e9978244fdc4bcfcbf98c6e4/52c646fc0d284c6bbcad48d7b81132cb7ba03c04e9978244fdc4bcfcbf98c6e4-json.log: No such file or directory
```

However, this also disables the logs API, so the docker logs CLI will no longer work, and neither will the /logs API endpoint. This means that if you are using, for example, Logspout to ship logs off the Docker host, you will still have to use the default json-file option.

Integrating the Sumo Logic Collector With the New Syslog Logging Driver

In a previous blog post, we described how to use the Sumo Logic Collector images to get container logs to Sumo Logic. We have prepared an image that extends the framework developed in that post. You can get all the logs into Sumo Logic by running with the syslog logging driver and running the Sumo Logic Collector on the host:

```
$ docker run -v /var/log/syslog:/syslog -d \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-logging-driver-syslog \
    [Access ID] [Access Key]
```

As a Sumo Logic user, it is very easy to generate the required access key by going to the Preferences page. And that’s it, folks. Select the syslog logging driver, add the Sumo Logic Collector container to your hosts, and all the container logs will go into one place for analysis and troubleshooting.
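If you ever need to confirm which driver a given container actually ended up with, the setting is recorded in the container's HostConfig; a quick check might look like this (the container name is a placeholder):

```
$ docker inspect --format '{{.HostConfig.LogConfig.Type}}' my-container
syslog
```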

An Official Docker Image For The Sumo Logic Collector

Note: This post is now superseded by Update On Logging With Docker.

Learning By Listening, And Doing

Over the last couple of months, we have spent a lot of time learning about Docker, the distributed application delivery platform that is taking the world by storm. We have started looking into how we can best leverage Docker for our own service. And of course, we have spent a lot of time talking to our customers. We have so far learned a lot by listening to them describe how they deal with logging in a containerized environment.

We have actually already re-blogged how Caleb, one of our customers, is Adding Sumo Logic To A Dockerized App. Our very own Dwayne Hoover has written about Four Ways to Collect Docker Logs in Sumo Logic. Along the way, it has become obvious that it makes sense for us to provide an “official” image for the Sumo Logic Collector. Sumo Logic exposes an easy-to-use HTTP API, but the vast majority of our customers are leveraging our Collector software as a trusted, production-grade data collection conduit. We are, and will continue to be, excited about folks building their own images for their own custom purposes. Yet the questions we get make it clear that we should release an official Sumo Logic Collector image for use in a containerized world.

Instant Gratification, With Batteries Included

A common way to integrate logging with containers is to use Syslog. This has been discussed before in various places all over the internet. If you can direct all your logs to Syslog, we now have a Sumo Logic Syslog Collector image that will get you up and running immediately:

```
$ docker run -d -p 514:514 -p 514:514/udp \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-syslog \
    [Access ID] [Access key]
```

Started this way, the default Syslog port 514 is mapped to the same port on the host. To test whether everything is working well, use telnet on the host:

```
$ telnet localhost 514
```

Then type some text, hit return, then CTRL-] to close the connection, and enter quit to exit telnet. After a few moments, what you typed should show up in the Sumo Logic service. Use a search to find the message(s).

To test the UDP listener, on the host, use Netcat, along the lines of:

```
$ echo "I'm in ur sysloggz" | nc -v -u -w 0 localhost 514
```

And again, the message should show up on the Sumo Logic end when searched for.

If you want to start a container that is configured to log to syslog and make it automatically latch on to the Collector container’s exposed port, use linking:

```
$ docker run -it --link sumo-logic-collector:sumo ubuntu /bin/bash
```

From within the container, you can then talk to the Collector listening on port 514 by using the environment variables populated by the linking:

```
$ echo "I'm in ur linx" | nc -v -u -w 0 $SUMO_PORT_514_TCP_ADDR $SUMO_PORT_514_TCP_PORT
```

That’s all there is to it. The image is available from Docker Hub. Setting up an Access ID/Access Key combination is described in our online help.

Composing Collector Images From Our Base Image

Following the instructions above will get you going quickly, but of course it can’t possibly cover all the various logging scenarios that we need to support. To that end, we actually started by first creating a base image. The Syslog image extends this base image, and your future images can easily extend it as well. Let’s take a look at what is actually going on!
Here’s the Github repo: https://github.com/SumoLogic/sumologic-collector-docker.

One of the main things we set out to solve was to clarify how to create an image that does not require customer credentials to be baked in. Having credentials in the image itself is obviously a bad idea! Putting them into the Dockerfile is even worse. The trick is to leverage a not-so-well-documented command line switch on the Collector executable to pass the Sumo Logic Access ID and Access Key combination to the Collector. Here’s the meat of the run.sh startup script referenced in the Dockerfile:

```
/opt/SumoCollector/collector console -- -t -i $access_id -k $access_key \
    -n $collector_name -s $sources_json
```

The rest is really just grabbing the latest Collector Debian package and installing it on top of a base Ubuntu 14.04 system, invoking the start script, checking arguments, and so on. As part of our continuous delivery pipeline, we are getting ready to update the Docker Hub-hosted image every time a new Collector is released. This will ensure that when you pull the image, the latest and greatest code is available.

How To Add The Batteries Yourself

The base image is intentionally kept very sparse and essentially ships with “batteries not included”. In itself, it will not lead to a working container. This is because the Sumo Logic Collector has a variety of ways to set up the actual log collection. It supports tailing files locally and remotely, as well as pulling Windows event logs locally and remotely. Of course, it can also act as a Syslog sink. And it can do any of this in any combination at the same time. Therefore, the Collector is either configured manually via the Sumo Logic UI, or (and this is almost always the better way) via a configuration file. The configuration file, however, is something that will change from use case to use case and from customer to customer. Baking it into a generic image simply makes no sense.

What we did instead is provide a set of examples. These can be found in the same Github repository under “example”: https://github.com/SumoLogic/sumologic-collector-docker/tree/master/example. There are a couple of sumo-sources.json example files illustrating, respectively, how to set up file collection, and how to set up Syslog UDP and Syslog TCP collection. The idea is to allow you to either take one of the example files verbatim, or use one as a starting point for your own sumo-sources.json. Then, you can build a custom image using our image as a base image. To make this more concrete, create a new folder and put this Dockerfile in there:

```
FROM sumologic/collector
MAINTAINER Happy Sumo Customer
ADD sumo-sources.json /etc/sumo-sources.json
```

Then, put a sumo-sources.json into the same folder, groomed to fit your use case. Then build the image and enjoy.
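The build-and-run step itself is just plain Docker. A minimal sketch, where my-sumo-collector is an arbitrary example tag and the folder name is whatever you created above:

```
$ cd my-collector-image   # folder containing the Dockerfile and sumo-sources.json
$ docker build -t my-sumo-collector .
$ docker run -d --name="sumo-logic-collector" \
    my-sumo-collector [Access ID] [Access Key]
```

Add whatever -v volume mounts your sumo-sources.json expects, as shown in the full example below.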
A Full Example

Using this approach, if you want to collect files from various containers, mount a directory on the host to the Sumo Logic Collector container. Then mount the same host directory to all the containers that use file logging. In each container, set up logging to log into a subdirectory of the mounted log directory. Finally, configure the Collector to just pull it all in.

The Sumo Logic Collector has been used in production for years across our customer base for pulling logs from files. More often than not, the Collector is pulling from a deep hierarchy of files on some NAS mount or equivalent. The Collector is quite adept and battle-tested at dealing with file-based collection.

Let’s say the logs directory on the host is called /tmp/clogs. Before setting up the source configuration accordingly, make a new directory for the files describing the image; call it, for example, sumo-file. Into this directory, put this Dockerfile:

```
FROM sumologic/collector
MAINTAINER Happy Sumo Customer
ADD sumo-sources.json /etc/sumo-sources.json
```

The Dockerfile extends the base image, as discussed. Next to the Dockerfile, in the same directory, there needs to be a file called sumo-sources.json which contains the configuration:

```
{
  "api.version": "v1",
  "sources": [
    {
      "sourceType": "LocalFile",
      "name": "localfile-collector-container",
      "pathExpression": "/tmp/clogs/**",
      "multilineProcessingEnabled": false,
      "automaticDateParsing": true,
      "forceTimeZone": false,
      "category": "collector-container"
    }
  ]
}
```

With this in place, build the image, and run it:

```
$ docker run -d -v /tmp/clogs:/tmp/clogs \
    --name="sumo-logic-collector" \
    [image name] [your Access ID] [your Access key]
```

Finally, add -v /tmp/clogs:/tmp/clogs when running other containers that are configured to log to /tmp/clogs, so the Collector can pick up their files.

Just like the ready-to-go syslog image described at the beginning, a canonical image for file collection is available. See the source: https://github.com/SumoLogic/sumologic-collector-docker/tree/master/file.

```
$ docker run -d -v /tmp/clogs:/tmp/clogs \
    --name="sumo-logic-collector" \
    sumologic/collector:latest-file \
    [Access ID] [Access key]
```

If you want to learn more about using JSON to configure sources for collecting logs with the Sumo Logic Collector, there is a help page with all the options spelled out.

That’s all for today. We have more coming. Watch this space. And yes, comments are very welcome.

Me at the End of the World