

Posts by Michael Floyd


New DevOps Site Chronicles the Changing Face of DevOps


Designing a Data Analytics Strategy for Modern Apps

Yesterday at AWS re:Invent 2016, Sumo Logic co-founder and CTO Christian Beedgen presented his vision for machine data analytics in a world where modern apps are disrupting virtually every vertical market in business. Every business is a software business, as Marc Andreessen wrote more than five years ago. Today, driven by customer demand, the need to differentiate and the push for agility, digital transformation initiatives are disrupting every industry. "We are still at the very beginning of this wave of digital transformation," Christian said. "By 2020 half of all businesses will have figured out digitally enhanced products and services."

The result is that modern apps are being architected differently than they were just three years ago. Cloud applications are being built on microservices by DevOps teams that automate to deliver new functionality faster. "It used to be that you could take the architecture and put it on a piece of paper with a couple of boxes and a couple of arrows. Our application architecture was really clean." But with this speed and agility comes complexity, and the need for visibility has become paramount. "Today our applications look like spaghetti. Building microservices, wiring them up, integrating them so they can work with something else, foundational services, SQL databases, NoSQL databases…" You need to be able to see what's going on, because you can't fix what you cannot see. Modern apps require Continuous Intelligence to provide insights, continuously and in real time, across the entire application lifecycle.

Designing Your Data Analytics Strategy

Ben Newton, Sumo Logic's Principal Product Manager of the Metrics team, took the stage to look at the various types of data and what you can do with them. Designing a data analytics strategy begins by understanding the data types produced as machine data, then focusing on the activities that data supports. The primary activities are monitoring, where you detect and notify (or alert), and troubleshooting, where you identify, diagnose, restore and resolve. "What we often find is that users can use that same data to do what we call App Intelligence – the same logs and metrics that allow you to figure out something is broken also tell you what your users are doing. If you know what users are doing, you can make life better for them, because that's what really matters."

So who really cares about this data? When it comes to monitoring, where the focus is on user-visible functionality, it's your DevOps and traditional IT Ops teams. Engineering and development are also responsible for monitoring their code. In troubleshooting apps, where the focus is on end-to-end visibility, customer success and technical support teams also become stakeholders. For app intelligence, where the focus is on user activity and visibility, everyone is a stakeholder, including sales, marketing, and product management. "Once you have all of this data, all of these people are going to come knocking on your door," said Ben.

Once you understand the data types you have, where they live within your stack, and the use cases, you can begin to use data to solve real problems. In defining what to monitor and measure, Ben highlighted:

- Monitor what's important to your business and your users.
- Measure and monitor user-visible metrics.
- Build fewer, higher-impact, real-time monitors.

"Once you get to the troubleshooting side, it gets back to: you can't fix what you can't measure." Ben also noted:

- You can't improve what you can't measure.
- You need both activity metrics and detailed logs.
- Up-to-date data drives better data-driven decisions.
- You need data from all parts of your stack.

So what types of data will you be looking at? Ben broke it down into the following categories:

- Infrastructure: rollups vs. detailed data. What resolution makes sense? Is real-time necessary?
- Platform: rollups vs. detailed data, coverage of all components, detailed logs for investigations, and architecture captured in the metadata.
- Custom: How is your service measured? What frustrates users? How does the business measure itself?

"Everything you have produces data. It's important to ensure you have all of the components covered." Once you have all of your data, it's important to think about the metadata. Systems are complex, and the way you make sense of them is through your metadata, which you use to describe or tag your data. "For the customer, this is the code you wrote yourself. You are the only people that can figure out how to monitor that. So one of the things you have to think about is the metadata."

Cloud Cruiser – A Case Study

Cloud Cruiser's Lead DevOps Engineer, Ben Abrams, took the stage to show how the company collects data and to share some tips on tagging it with metadata. Cloud Cruiser is a SaaS app that enables you to easily collect, meter, and understand your cloud spend in AWS, Azure, and GCP. Cloud Cruiser's customers are large enterprises and mid-market players, globally distributed across all verticals, and they manage hundreds of millions in cloud spend.

Cloud Cruiser had been using an Elastic (Elasticsearch, Logstash, and Kibana) stack for their log management solution, and discovered that managing their own logging solution was costly and burdensome. Ben cited the following reasons for switching:

- The operational burden was a distraction from the core business.
- Improved security.
- Ability to scale, plus cost.

Cloud Cruiser runs on AWS (300-500 instances) and utilizes microservices written in Java using the Dropwizard framework. Their front-end web app runs on Tomcat and uses AngularJS. Figure 1 shows the breadth of the technology stack.

In evaluating a replacement solution, Ben said, "We were spending too much time on our ELK stack." Sumo Logic's Unified Logs and Metrics (ULM) was also a distinguishing factor: the inclusion of metrics meant that they didn't have to employ yet another tool that would likewise have to be managed. "Logs are what you look at when something goes wrong. But metrics are really cool." Ben summarized the value and benefits they achieved this way:

Logs
- Reduced operational burden.
- Reduced cost.
- Increased confidence in log integrity.
- Able to reduce the number of people needing VPN access.
- Alerting based on searches did not need ops handholding.

Metrics
- Increased visibility into system and application health.
- Used in an ongoing effort around application and infrastructure changes that reduced our monthly AWS bill by over 100%.

Ben then moved into a hands-on session, showing how they automate the configuration and installation of Sumo Logic collectors, and how they tag their data using source categories. Cloud Cruiser currently collects data from the following sources:

- Chef: automation of config and collector install
- Application Graphite metrics from Dropwizard
- Other Graphite metrics forwarded by Sensu to Sumo Logic

"When I search for something I want to know what environment it is, what type of log it is, and which server role it came from." One of their decisions was to differentiate log data from metrics data, as shown below.
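To make the idea concrete, here is a minimal sketch of that kind of source-category convention in Python. Cloud Cruiser's exact scheme was shown on a slide rather than spelled out in this post, so the function and the example values below are illustrative assumptions, not their actual configuration:

# Build a Sumo Logic _sourceCategory value from the three pieces of metadata the
# talk calls out: environment, type of data (logs vs. metrics), and Chef role.
def source_category(environment, data_type, chef_role):
    return "/".join([environment, data_type, chef_role])

print(source_category("prod", "logs", "web-frontend"))      # prod/logs/web-frontend
print(source_category("prod", "metrics", "web-frontend"))   # prod/metrics/web-frontend
print(source_category("staging", "logs", "worker"))         # staging/logs/worker

Searching on a prefix such as prod/metrics/* then isolates one environment and one data type, which is exactly the kind of query-by-environment, type and role described next.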
Using this schema allows them to search logs and metrics by environment, type of log data and corresponding Chef role. Ben walked through the Chef cookbook they use to deploy and shared how they automate the configuration and installation of Sumo Logic collectors. For those interested, I'll follow up on this in the DevOps Blog. A key point from Ben, though, was "Don't log secrets." The access ID and key should be defined elsewhere, out of scope, and stored in an encrypted data bag. Ben also walked through the searches they used to construct the following dashboard. Through this one dashboard, Cloud Cruiser can use both metrics and log data to get an overview of the health of their production deployment.

Key Takeaways

Designing your data analytics strategy is highly dependent on your architecture, and ultimately it's no longer just about troubleshooting issues in production environments; it's about understanding the experience you provide to your users. The variety of data streaming in real time from the application, operating environment and network layers produces an ever-increasing volume of data every day. Log analytics provides the forensic data you need, and time-series metrics give you insight into the real-time changes taking place under the hood. To understand both the health of your deployment and the behavior and experience of your customers, you need to gather machine data from all of its sources, then apply both logs and metrics to give teams from engineering to marketing the insights they need. Download the slides and view the entire presentation below:


CDN with AWS CloudFront - Tutorial


Sumo Logic Brings Machine Data Analytics to AWS Marketplace

The founders of Sumo Logic recognized early on that in order to remain competitive in an increasingly disruptive world, companies would be moving to the cloud to build what are now being called modern apps. Hence, Sumo Logic purposefully architected its machine data analytics platform from the ground up on Amazon Web Services. Along the way, Sumo Logic has acquired vast knowledge and expertise not only in log management overlaid with metrics, but in the inner workings of the services offered by Amazon.

Today, more than six years later, we are pleased to announce that Sumo Logic is one of a handful of initial AWS partners participating in the launch of SaaS Subscription products on Amazon Web Services (AWS) Marketplace, and the immediate availability of the Sumo Logic platform in AWS Marketplace. Now, customers already using AWS can leverage Sumo Logic's expertise in machine data analytics to visualize and monitor workloads in real time, identify issues and expedite root cause analysis to improve operational and security insights across AWS infrastructure and services.

How it Works

AWS Marketplace is an online store that allows AWS customers to procure software and services directly in the marketplace and immediately start using those services. Billing runs through your AWS account, allowing your organization to consolidate billing for all AWS services, SaaS subscriptions and software purchased through the Marketplace. To get started with Sumo Logic in the AWS Marketplace, go to the Sumo Logic page. You should see a screen similar to the following.

Pricing

As you can see, pricing is clearly marked next to the product description. Pricing is based on several factors, starting with which edition of Sumo Logic you're using – Professional or Enterprise. Professional edition supports up to 20 users and 30 days of data retention, among other features. Enterprise edition includes support for more than 20 users and multi-year data retention as part of its services. See the Sumo Logic Pricing page for more information.

Reserved Log Ingest

Once you've decided on an edition, you're ready to select the plan that's best for you based on your anticipated ingest volume. Reserved Log Ingest Volume is the amount of logs you have contracted to send each day to the Sumo Logic service. The Reserved price is how much you pay per GB of logs ingested each day. During signup, you can select a Reserved capacity in GB per day (see below). There are no minimum days, and you can cancel at any time.

On-Demand Log Ingest

Bursting is allowed, and at the end of the billing cycle you pay the On-Demand rate for any usage beyond the total Reserved capacity for the period. Your first 30 days of service usage are FREE.

Signing up

When you click Continue, you'll be taken to a Sumo Logic signup form similar to Figure 2. Enter your email address, then click Plan to select your Reserved Log Ingest volume. At this point you select your Reserved capacity. Plans are available in increments of 1, 3, 5, 10, 20, 30, 40 and 50 GB per day. Once you've selected your plan, click the signup button to be taken through the signup process. Recall that billing is managed through AWS, so no credit card is required.
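To make the reserved-plus-on-demand model concrete, here is a rough sketch of how a monthly bill comes together. The plan size and per-GB rates below are made-up placeholders for illustration, not actual Sumo Logic or AWS Marketplace prices:

# Hypothetical numbers; substitute the rates shown on the Marketplace listing.
RESERVED_GB_PER_DAY = 5          # the reserved ingest plan selected at signup
RESERVED_RATE_PER_GB = 3.00      # assumed reserved price per GB per day
ON_DEMAND_RATE_PER_GB = 4.50     # assumed on-demand price per GB of overage

def monthly_bill(daily_ingest_gb):
    """daily_ingest_gb: actual GB ingested on each day of the billing cycle."""
    days = len(daily_ingest_gb)
    reserved_charge = RESERVED_GB_PER_DAY * RESERVED_RATE_PER_GB * days
    # Bursting is allowed; anything beyond the total reserved capacity for the
    # period is billed at the on-demand rate at the end of the cycle.
    total_reserved_capacity = RESERVED_GB_PER_DAY * days
    overage_gb = max(0.0, sum(daily_ingest_gb) - total_reserved_capacity)
    return reserved_charge + overage_gb * ON_DEMAND_RATE_PER_GB

usage = [5] * 27 + [12, 15, 9]   # a 30-day month with a few bursty days
print(f"Estimated bill: ${monthly_bill(usage):,.2f}")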
What You Get

If you're not already familiar with Sumo Logic, the platform unifies logs, metrics and events, transforming a variety of data types into real-time continuous intelligence across the entire application lifecycle and enabling organizations to build, run and secure their modern applications. Highlights of Sumo Logic include:

- Unified Logs and Metrics.
- Machine learning capabilities like LogReduce and LogCompare to quickly identify root cause.
- Elasticity and bursting support without over-provisioning.
- Data encryption at rest, PCI DSS 3.1 compliance with log immutability, and HIPAA compliance at no additional cost.
- Zero log infrastructure management overhead.

Go to sumologic.com for more information.

Sumo Logic Apps for AWS

As mentioned, Sumo Logic has tremendous expertise in AWS and experience building and operating massively multi-tenant, highly distributed cloud systems. Sumo Logic passes that expertise along to its customers in the form of apps for AWS services. Sumo Logic Apps for AWS contain preconfigured searches and dashboards for the most common use cases, and are designed to accelerate your time to value with Sumo Logic. Using these dashboards and searches you can quickly get an overview of your entire AWS application at the app, system and network levels. You can quickly identify operational issues, drill down using search, and apply tools like LogReduce and LogCompare to quickly get at the root cause of the problem. You also gain operational, security and business insight into services that support your app, like S3, CloudTrail, VPC Flow Logs and Lambda. Apps that are generally available include:

- Amazon S3 Audit App
- Amazon VPC Flow Logs App
- Amazon CloudFront App
- AWS CloudTrail App
- AWS Config App
- AWS Elastic Load Balancing App
- AWS Lambda App

In addition, the following apps are in preview for Sumo Logic customers:

- Amazon CloudWatch – ELB Metrics
- Amazon RDS Metrics

Getting Started and Next Steps

Sumo Logic is committed to educating its customers using the deep knowledge and expertise it has gained in working with AWS. If you're new to Amazon Web Services, we've created AWS Hub, a portal dedicated to learning AWS fundamentals. The portal includes 101s to get you started with EC2, S3, ELB, VPC Flow Logs and AWS Lambda. In addition, you'll find deep-dive articles and blog posts walking you through many of the AWS service offerings. Finally, if you're planning to attend AWS re:Invent at the end of November, stop by to get your questions answered, or take a quick tour of Sumo Logic and all that machine learning and data analytics have to offer.

November 16, 2016


Troubleshooting Apps and Infrastructure Using Puppet Logs

If you're working with Puppet, there's a good chance you've run into problems when configuring a diverse infrastructure. It could be a problem with authorization and authentication, or perhaps with MCollective, Orchestration or Live Management. Puppet logs can provide a great deal of insight into the status of your apps and infrastructure across the data center. Knowing where to look is just the first step; knowing what to look for is another matter. Here's a cheat sheet that will help you identify the logs that are most useful and show you what to look for. I'll also explain how to connect Puppet to a log aggregation and analytics tool like Sumo Logic.

Where are Puppet Logs Stored?

The Puppet Enterprise platform produces a variety of log files across its software architecture. This Puppet documentation describes the file paths of the following types of log files:

- Master logs: master application logs containing information such as fatal errors, reports, warnings, and compilation errors.
- Agent logs: information on client configuration retrieved from the Master.
- ActiveMQ logs: information on ActiveMQ actions on specific nodes.
- MCollective service logs: information on MCollective actions on specific nodes.
- Console logs: information around console errors, fatal errors and crash reports.
- Installer logs: information around Puppet installations, such as errors that occurred during installation, the last installation run and other relevant details.
- Database logs: information around database modifications, errors, etc.
- Orchestration logs: information around orchestration changes.

The root of Puppet log storage differs depending on whether Puppet is running on a Unix-like system or in a Windows environment. For *nix-based installs, the root folder for Puppet is /etc/puppetlabs/puppet. For Windows-based installs, the root folder is C:\ProgramData\PuppetLabs\puppet\etc for all versions of Windows Server from 2008 onwards.

Modifying Puppet Log Configuration

The main setting that needs to be configured correctly to get the best from Puppet logs is the log_level attribute within the main Puppet configuration file. The log_level parameter can have the following values, with "notice" being the default: debug, info, notice, warning, err, alert, emerg, crit.

The Puppet Server can also be configured to process logs. This is done using the Java Virtual Machine Logback library. An .xml file is created, usually named logback.xml, which is passed to the Puppet Server at run time; if a different filename is used, it needs to be specified in the global.conf file. The .xml file allows you to override the default root logging level of "info". Possible levels are trace, debug, info, warn, and error. For example, if you wanted to produce full debug data for Puppet, you would add the following element to the .xml file:

<root level="debug">

The Most Useful Puppet Logs

Puppet produces a number of very useful log files, from basic platform logs to full application orchestration reports. The most commonly used Puppet logs include:

Platform Master Logs

These give generalized feedback on issues such as compilation errors, deprecation warnings, and crash/fatal termination. They can be found at the following locations:

- /var/log/puppetlabs/puppetserver/puppetserver.log
- /var/log/puppetlabs/puppetserver/puppetserver-daemon.log

Application Orchestration Logs

Application orchestration is probably the single most attractive aspect of the Puppet platform.
It enables the complete end-to-end integration of the DevOps cycle into a production software application. As a result, these logs are likely to be the most critical of all. They include:

- /var/log/pe-mcollective/mcollective.log – contains all of the log entries that affect the MCollective platform itself. This is a good first place to check if something has gone wrong with application orchestration.
- /var/lib/peadmin/.mcollective.d/client.log – a log file found on the client connecting to the MCollective server, the twin of the log file above and the second place to begin troubleshooting.
- /var/log/pe-activemq/activemq.log – a log file that contains entries for ActiveMQ.
- /var/log/pe-mcollective/mcollective-audit.log – a top-level view of all MCollective requests. This is a good place to look if you are unsure of exactly where the problem occurred, so that you can pinpoint the specific audit event that triggered it.

Puppet Console Logs

Also valuable are the Puppet console logs, which include the following:

- /var/log/pe-console-services/console-services.log – the main console log, containing entries for top-level events and requests from all services that access the console.
- /var/log/pe-console-services/pe-console-services-daemon.log – low-level console event logging that occurs before the standard logback system is loaded. This is a useful log to check if the problem involves the higher-level logback system itself.
- /var/log/pe-httpd/puppet-dashboard/access.log – a log of all HTTPS access requests made to the Puppet console.

Advanced Logging Using Further Tools

The inbuilt logging functions of Puppet mostly revolve around solving issues with the Puppet platform itself. However, Puppet offers some additional technologies to help visualize status data. One of these is the Puppet Services Status Check, which is both a top-level dashboard and a queryable API that provides real-time status information on the entire Puppet platform. Puppet can also be configured to support Graphite. Once this has been done, a mass of useful metrics can be analyzed using either the demo Graphite dashboard provided or a custom dashboard. The ready-made Grafana dashboard makes a good starting point for measuring application performance that will affect end users, as it includes the following metrics by default:

- Active requests – a graphical measure of the current application load.
- Request durations – a graph of average latency/response times for application requests.
- Function calls – a graphical representation of the different functions called from the application catalogue. Potentially very useful for tweaking application performance.
- Function execution time – graphical data showing how fast specific application processes are executed.

Using Puppet with Sumo Logic

To get the most out of Puppet log data, you can analyze it using an external log aggregation and analytics platform such as Sumo Logic. To work with Puppet logs in Sumo Logic, you simply use the Puppet module for installing the Sumo Logic collector. You'll then be able to visualize and monitor all of your Puppet log data from the Sumo Logic interface, alongside logs from any other applications that you connect to Sumo Logic. You can find open source collectors for Docker, Chef, Jenkins, FluentD and many other servers at Sumo Logic Developers on Github.
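Before (or alongside) shipping these files to an analytics platform, a quick local triage is often just a scan of the master log for the error classes mentioned above. The sketch below is illustrative only; the path is the *nix default listed earlier and the patterns are assumptions about what you might care about:

# Count lines in the Puppet Server master log that mention errors, compilation
# problems or deprecations, as a first pass before deeper troubleshooting.
from collections import Counter

LOG_PATH = "/var/log/puppetlabs/puppetserver/puppetserver.log"
PATTERNS = ("ERROR", "Compilation", "deprecat")   # assumed markers of interest

counts = Counter()
with open(LOG_PATH, errors="replace") as log_file:
    for line in log_file:
        for pattern in PATTERNS:
            if pattern in line:
                counts[pattern] += 1

for pattern, count in counts.most_common():
    print(f"{pattern}: {count} matching lines")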
About the Author

Ali Raza is a DevOps consultant who analyzes IT solutions, practices, trends and challenges for large enterprises and promising new startup firms.

"Troubleshooting Apps and Infrastructure Using Puppet Logs" is published by the Sumo Logic DevOps Community. If you'd like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.


Sumo Logic Launches New Open Source Site on Github Pages

This week we deployed our new Github Pages site, Sumo Logic Developers, to showcase some of the cool open-source projects being built in and around Sumo Logic. The goal is to share the knowledge and the many projects being built by our customers, partners and everyone out in the wild. Aside from the official Sumo Logic repositories, there's also a lot of tribal knowledge: our Sales Engineers and Customer Success teams work daily with our customers to solve real-world problems. Often those solutions include search queries, field extraction rules, useful regular expressions, and configuration settings that are captured in Github and other code registries. Sumo Logic Developers aggregates this knowledge in one place under Sumo Experts. Likewise, we're seeing more organizations creating plugins and otherwise integrating Sumo Logic into their products using our REST APIs, so under Integrations you'll find a Jenkins Publisher plugin, along with tools for working with Twilio, Chef, DataDog, FluentD and many others. Sumo Logic Developers also provides everything our customers need to collect and ingest data, including APIs and documentation, educational articles and walkthroughs from our Dev Blog, and links to all of our supported apps and integrations.

The project was an enjoyable break from daily tasks, and for those considering building a Github Pages site, I'd like to share my experience in building Sumo Logic Developers. If you'd like to contribute or add your project or repo to one of the lists on Sumo Logic Developers, simply reply to this thread on our community site, Sumo Dojo. Alternatively, you can ping me on Twitter @codejournalist.

Deploying Sumo Logic Developers on Github Pages

Sumo Logic Developers is built on Github Pages using Jekyll as a static site generator. Github Pages sites are served from a special repository associated with your User or Organization account on Github, so all the code is visible unless you make it private. Github Pages supports Jekyll natively, so it's easy to run builds on your local machine and review changes on localhost before deploying to Github.

Using Bootstrap

I used the Bootstrap JavaScript framework to make the site responsive and mobile friendly. In particular, Bootstrap CSS uses a grid system based on rows and up to 12 columns. The grid scales as the device or viewport size increases and decreases, and it includes predefined classes for easy layout options. So I can specify a new row like this:

<div class="row">

Within the row, I can designate the entire width of the screen like this:

<div class="col-xs-12 col-sm-12 col-md-12 col-lg-12 hidden-xs">

This creates a full-width area using all 12 columns. Note that Bootstrap CSS lets you specify different sizes for desktops, laptops, tablets and mobile phones; this last example hides the content on extra-small mobile screens. The following creates an area of 4 units on large and medium screens (most desktops and laptops), 6 units on small screens (most tablets), and 8 units on extra-small screens (phones):

<div class="col-lg-4 col-md-4 col-sm-6 col-xs-8">

Using Jekyll and Git

Jekyll is a static site generator that includes the Liquid template engine for creating data-driven sites. Using Liquid's template language, I was able to populate open-source projects into the site from data read out of a YAML file. This feature allows our internal teams to quickly update the site without having to muck around in the code.
A typical entry looks like:

- title: "Sumo Report Generator"
  repo_description: "Tool that allows a user to execute multiple searches, and compile the data into a single report."
  url: "https://github.com/SumoLogic/sumo-report-generator"

We use a Github workflow, so team members can simply create a new branch off of master, edit the title, repo_description and url, then make a pull request to merge the change. You can find all of the source code at https://github.com/SumoLogic/sumologic.github.io

Contributing to the Github Pages Site

At its heart, Sumo Logic Developers is really a wiki: it's informative, with an amazing amount of rich content; it's based on the work of our community of developers and DevOps practitioners; and it operates on a principle of collaborative trust. It's brimming with tribal knowledge – the lessons learned from those who've travelled before you. In fact, the site currently represents:

- 30 official Sumo Logic repositories
- ~20 individually maintained repos from our customers, as well as our SE, Customer Success and Engineering teams
- 100+ Gists containing scripts, code snippets, useful regular expressions, etc.
- Third-party integrations with Jenkins, FluentD, Chef, Puppet and others
- 100+ blog posts
- Links to our community, APIs and documentation

Our main task now is finding new projects and encouraging members like you to contribute. The goal is to empower the tribe – all of those who use Sumo Logic to build, run and secure their modern applications. If you'd like to contribute, post your project on Sumo Dojo, or ping us @sumologic.

Michael is the Head of Developer Programs at Sumo Logic. You can follow him on Twitter @codejournalist. Read more of Michael's posts on Sumo Logic's DevOps Blog.


Customers Share their AWS Logging with Sumo Logic Use Cases

In June, Sumo Dojo (our online community) launched a contest to learn more about how our customers are using Amazon Web Services like EC2, S3, ELB, and AWS Lambda. The Sumo Logic service is built on AWS, and we have deep integration with Amazon Web Services. As an AWS Technology Partner, we've collaborated closely with AWS to build apps like the Sumo Logic App for Lambda. So we wanted to see how our customers are using Sumo Logic to do things like collecting logs from CloudWatch to gain visibility into their AWS applications. We thought you'd be interested in hearing how others are using AWS and Sumo Logic, too. So in this post I'll share their stories, along with announcing the contest winner.

The contest narrowed down to two finalists. SmartThings, a Samsung company, operates in the home automation industry and provides access to a wide range of connected devices to create smarter homes that enhance comfort, convenience, security and energy management for the consumer. WHOmentors, Inc., our second finalist, is a publicly supported scientific, educational and charitable corporation, and fiscal sponsor of Teen Hackathon. The organization is, according to their site, "primarily engaged in interdisciplinary applied research to gain knowledge or understanding to determine the means by which a specific, recognized need may be met." At stake was a DJI Phantom 3 drone. All entrants were awarded a $10 Amazon gift card.

AWS Logging Contest Rules

The drone winner was selected based on the following criteria:

- You had to be a user of Sumo Logic and AWS.
- To enter the contest, a comment had to be placed on this thread in Sumo Dojo.
- The post could not be anonymous – you were required to log in to post and enter.
- Submissions closed August 15th.

As noted in the Sumo Dojo posting, the winner would be selected based on our own editorial judgment and community reactions to the post (in the form of comments or "likes") to pick the entry that's most interesting, useful and detailed.

SmartThings

SmartThings has been working on a feature to enable over-the-air (OTA) firmware updates of Zigbee devices on users' home networks. For the uninitiated, Zigbee is an IEEE specification for a suite of high-level communication protocols used to create personal area networks with small, low-power digital radios. See the Zigbee Alliance for more information. According to one of the firmware engineers at SmartThings, there are a lot of edge cases and potential points of failure for an OTA update, including:

- The cloud platform
- An end user's hub
- The device itself
- Power failures
- RF interference on the mesh network

Disaster in this scenario would be a user's device ending up in a broken state. As Vlad Shtibin related:

"Our platform is deployed across multiple geographical regions, which are hosted on AWS. Within each region we support multiple shards, and within each shard we run multiple application clusters. The bulk of the services involved in the firmware update are JVM-based application servers that run on AWS EC2 instances. Our goal for monitoring was to be able to identify as many of these failure points as possible and implement a recovery strategy. Identifying these points is where Sumo Logic comes into the picture. We use a key-value logger with a specific key/value for each of these failure points, as well as a correlation ID for each point of the flow.
Using Sumo Logic, we are able to aggregate all of these logs by passing the correlation ID when we make calls between the systems. We then created a search query (and eventually a dashboard) to view the flow of the firmware updates as they went from our cloud down to the device and back up to the cloud to acknowledge that the firmware was updated. This query parses the log messages to retrieve the correlation ID, hub, device, status, firmware versions, etc. These values are then fed into a Sumo Logic transaction, enabling us to easily view the state of a firmware update for any user in the system at the micro level, and the overall health of all OTA updates at the macro level.

Depending on which part of the infrastructure the OTA update failed in, engineers are then able to dig deeper into the specific EC2 instance that had a problem. Because our application servers produce logs at the WARN and ERROR level, we can see if the update failed because of a timeout from the AWS ElastiCache service, or from a problem with a query on AWS RDS. Having quick access to logs across the cluster enables us to identify issues across our platform regardless of which AWS service we are using."

As Vlad noted, this feature is still being tested and hasn't been rolled out fully in production yet. "The big takeaway is that we are much more confident in our ability to identify updates, triage them when they fail and ensure that the feature is working correctly because of Sumo Logic."

WHOmentors.com

WHOmentors.com, Inc. is a nonprofit scientific research organization and the 501(c)(3) fiscal sponsor of Teen Hackathon. To facilitate their training in languages like Java, Python, and Node.js, each participant begins with the Alexa Skills Kit, a collection of self-service application program interfaces (APIs), tools, documentation and code samples that make it fast and easy for teens to add capabilities to Alexa-enabled products such as the Echo, Tap, or Dot. According to WHOmentors.com CEO Rauhmel Fox, "The easiest way to build the cloud-based service for a custom Alexa skill is by using AWS Lambda, an AWS offering that runs inline or uploaded code only when it's needed and scales automatically, so there is no need to provision or continuously run servers.

With AWS Lambda, WHOmentors.com pays only for what it uses. The corporate account is charged based on the number of requests for created functions and the time the code executes. While the AWS Lambda free tier includes one million free requests per month and 400,000 gigabyte (GB)-seconds of compute time per month, it becomes a concern when the students create complex applications that tie Lambda to other expensive services, or when their Lambda programs grow too large.

Ordinarily, someone would be assigned to use Amazon CloudWatch to monitor and troubleshoot the serverless system architecture and multiple applications using existing AWS system, application, and custom log files. Unfortunately, there isn't a central dashboard to monitor all created Lambda functions. With the integration of a single Sumo Logic collector, WHOmentors.com can automatically route all Amazon CloudWatch logs to the Sumo Logic service for advanced analytics and real-time visualization using the Sumo Logic Lambda functions on Github."

Using the Sumo Logic Lambda Functions

"Instead of a 'pull data' model, the Sumo Logic Lambda function grabs files and sends them to the Sumo Logic web application immediately. Their online log analysis tool offers reporting, dashboards, and alerting, as well as the ability to run specific advanced queries as needed. The real-time log analysis provided by the Sumo Logic Lambda function helps me quickly catch and troubleshoot performance issues, such as the request rate of concurrent executions from both stream-based and non-stream-based event sources, rather than having to wait hours to identify whether there was an issue.

I am most concerned about AWS Lambda limits (i.e., code storage) that are fixed and cannot be changed at this time. By default, AWS Lambda limits the total concurrent executions across all functions within a given region to 100. Why? The default limit is a safety limit that protects the corporate account from costs due to potential runaway or recursive functions during initial development and testing. As a result, I can quickly determine the performance of any Lambda function and clean up the corporate account by removing Lambda functions that are no longer used, or figure out how to reduce the code size of the Lambda functions that should not be removed, such as apps in production."

The biggest relief for Rauhmel is that he is able to encourage the trainees to focus on coding their applications instead of pressuring them to worry about the logs associated with the Lambda functions they create.
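For readers who want to try a similar setup, here is a minimal sketch of the pattern Rauhmel describes: a Lambda function subscribed to CloudWatch Logs that decodes each delivery and forwards the log lines to a Sumo Logic hosted HTTP source. This is an illustration rather than the official Sumo Logic Lambda function, and the endpoint URL is a placeholder for your own HTTP source URL:

# CloudWatch Logs delivers subscription data base64-encoded and gzip-compressed.
import base64
import gzip
import json
import os
import urllib.request

# Placeholder: the unique URL of a Sumo Logic hosted HTTP source.
SUMO_HTTP_ENDPOINT = os.environ.get("SUMO_HTTP_ENDPOINT", "https://collectors.example/receiver/v1/http/XXXX")

def handler(event, context):
    payload = gzip.decompress(base64.b64decode(event["awslogs"]["data"]))
    data = json.loads(payload)
    # Keep the log group/stream as context alongside each message.
    lines = [
        json.dumps({
            "logGroup": data["logGroup"],
            "logStream": data["logStream"],
            "message": log_event["message"],
        })
        for log_event in data["logEvents"]
    ]
    body = "\n".join(lines).encode("utf-8")
    request = urllib.request.Request(SUMO_HTTP_ENDPOINT, data=body, method="POST")
    urllib.request.urlopen(request)
    return {"forwarded": len(lines)}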
And the Winner of the AWS Logging Contest is…

Just as at the end of an epic World Series battle between two MLB teams, you sometimes wish both could be declared the winner. Alas, there can be only one. We looked closely at the use cases, which were very different from one another. Weighing factors like the breadth of usage across the Sumo Logic and AWS platforms added to the drama. While SmartThings uses Sumo Logic broadly to troubleshoot and prevent failure points, WHOmentors.com's use case is specific to AWS Lambda. But we couldn't ignore the cause of helping teens learn to write code in popular programming languages, building skills that may one day lead them to a job.

Congratulations to WHOmentors.com. Your drone is on its way!


Dockerizing Microservices for Cloud Apps at Scale

Last week I introduced the Sumo Logic Developers' Thought Leadership Series, where JFrog's co-founder and Chief Architect, Fred Simon, came together with Sumo Logic's Chief Architect, Stefan Zier, to talk about optimizing continuous integration and delivery using advanced analytics. In Part 2 of this series, Fred and Stefan dive into Docker and Dockerizing microservices. Specifically, I asked Stefan about initiatives within Sumo Logic to Dockerize parts of its service. What I didn't realize was the scale at which these Dockerized microservices must be delivered.

Sumo Logic is in the middle of Dockerizing its architecture and is doing it incrementally. As Stefan says, "We've got a 747 in mid-air and we have to be cautious as to what we do to it mid-flight." The goal in Dockerizing Sumo Logic is to gain more speed out of the deployment cycle. Stefan explains, "There's a project right now to do a broader-stroke containerization of all of our microservices. We've done a lot of benchmarking of Artifactory to see what happens if a thousand machines pull images from Artifactory at once. That is the type of scale that we operate at. Some of our microservices have a thousand-plus instances of the service running, and when we do an upgrade we need to pull a thousand-plus in a reasonable amount of time, especially when we're going to do continuous deployment. You can't say 'well, we'll roll the deployment for the next three hours and then we're ready to run the code.' That's not quick enough anymore. It has to be minutes at most to get the code out there."

The Sumo Logic engineering team has learned a lot in going through this process. In terms of adoption and learning curve, Stefan suggests:

- Developer education – Docker is a new and foreign thing, and the benefits are not immediately obvious to people.
- Communication – talking through why it's important, why it's going to help and how to use it.
- Workshops – Sumo Logic does hands-on workshops in-house to get its developers comfortable with using Docker.
- Culture – build a culture around Docker.
- Plan for change – the tool chain is still evolving. You have to anticipate the evolution of the tools and plan for it.

As a lesson learned, Stefan explains, "We've had some fun adventures on Ubuntu – in production we run automatic upgrades for all our patches so you get security upgrades automatically. It turns out when you get an upgrade to the Docker daemon, it kills all the running containers. We had one or two instances (fortunately not in production) where across the fleet all containers went away. Eventually we traced it back to the Docker daemon, and now we're explicitly holding back Docker daemon upgrades and making them an explicit upgrade so that we are in control of the timing. We can do it machine by machine instead of the whole fleet at once."

JFrog on Dockerizing Microservices

Fred likewise shared JFrog's experiences, pointing out that JFrog's customers asked early on for Docker support, so JFrog has been in it from the early days of Docker. Artifactory has supported Docker images for more than two years. To Stefan's point, Fred says, "We had to evolve with Docker. So we Dockerized our pure SaaS [product] Bintray, which is a distribution hub for all the packages around the world. It's highly distributed across all the continents, CDN-enabled, [utilizes a] MongoDB cluster, CouchDB, and all of this problematic distributed software. Today Bintray is fully Dockerized.
We use Kubernetes for orchestration." One of the win-wins for JFrog developers is that the components a developer is not working on are delivered via Docker, the exact same containers that will run in production, on their own local workstation. "We use Vagrant to run Docker inside a VM with all the images, so the developer can connect to microservices exactly the same way. So the developer has the immediate benefit that he doesn't have to configure and install components developed by the other team."

Fred also mentioned Xray, which was just released and is fully Dockerized. Xray analyzes any kind of package within Artifactory, including Docker images, Debian, RPM, zip, jar and war files, and reports on what each contains. "That's one of the things with Docker images, it's getting hard to know what's inside it. Xray is based on 12 microservices, and we needed a way to put their software in the hands of our customers, because Artifactory is both SaaS and on-prem, we do both. So JFrog does fully Docker and Docker Compose delivery. So developers can get the first image and all images from Bintray."

"The big question to the community at large," Fred says, "is how do you deliver microservices software to your end customer? There is still some work to be done here."

More Docker Adventures – TL;DR

Adventures is a way of saying: we went on this journey, not everything went as planned, and here's what we learned from our experience. If you've read this far, I've provided a good summary of the first 10 minutes, so you can jump ahead to learn more. Each of the topics is marked by a slide, so you can quickly jump to a topic of interest. Those include:

- Promoting containers. Why it's important to promote your containers at each stage in the delivery cycle rather than retag and rebuild.
- Docker shortcuts. How Sumo Logic is implementing Docker incrementally and taking a hybrid approach versus doing pure Docker.
- Adventures Dockerizing Cassandra.
- Evolving conventions for Docker distribution.

New Shifts in Microservices

What are the new shifts in microservices? In the final segment of this series, Fred and Stefan dive into microservices and how they put pressure on your developers to create clean APIs. Stay tuned for more adventures building, running and deploying microservices in the cloud.


CI/CD, Docker and Microservices - by JFrog and Sumo Logic’s Top Developers


Sumo Dojo Winners - Using Docker, FluentD and Sumo Logic for Deployment Automation

Recently, Sumo Dojo ran a contest in the community to see who is analyzing Docker logs with Sumo Logic, and how. The contest ran through the month of June and was presented at DockerCon. Last week, Sumo Dojo selected the winner: Brandon Milsom, from Australia-based company Fugro Roames. Roames uses remote-sensing laser (LIDAR) technology to create interactive 3D asset models of powerline networks for energy companies in Australia and the United Kingdom. As Brandon writes:

"We use Docker and Sumo Logic as a part of our deployment automation. We use Ansible scripts to automatically deploy our developers' applications onto Amazon EC2 instances inside Docker containers as part of our cloud infrastructure. These applications are automatically configured to send tagged logs to Sumo Logic using Fluentd, which our developers use to identify their running instances for debugging and troubleshooting. Not only are the application logs sent directly to Sumo Logic, but the Docker container logs are also configured using Docker's built-in Fluentd logging driver. This forwards logs to another Docker container on the same host running a Fluentd server, which then seamlessly ships logs over to Sumo Logic. The result is that developers can easily access their application logs and the OS logs of the container their app is running in, just by opening a browser tab. Part of our development has also been trialling drones for asset inspection, and we also have a few drone fanatics in our office. Winning a drone would also be beneficial as it would give us something to shoot at with our Nerf guns, improving office morale."

Brandon's coworker, Adrian Howchin, also wrote in, saying: "I think one of the best things that we've gained from this setup is that it allows us to keep users from connecting (SSH) in to our instances. Given our CD setup, we don't want users connecting in to hosts where their applications are deployed (it's bad practice). However, we had no answer to the question of how they get their application/OS logs. Thanks to SumoLogic (and the Docker logging driver!), we're able to get these logs out to a centralized location, and keep the users out of the instances."

Congratulations to Brandon and the team at Fugro Roames. Now you have something cool to shoot at.
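For anyone who wants to experiment with a similar pipeline, here is a minimal sketch of starting a container with Docker's built-in Fluentd logging driver, using the Docker SDK for Python. The image name, Fluentd address and tag are placeholder assumptions, not Fugro Roames' actual configuration:

# Start a container whose stdout/stderr are shipped via the Fluentd logging driver
# to a Fluentd container on the same host, which in turn forwards to Sumo Logic.
import docker
from docker.types import LogConfig

client = docker.from_env()

log_config = LogConfig(
    type=LogConfig.types.FLUENTD,
    config={
        "fluentd-address": "localhost:24224",   # the Fluentd container on this host
        "tag": "myapp.{{.Name}}",               # hypothetical tag used to identify the instance
    },
)

container = client.containers.run(
    "my-app-image:latest",    # placeholder image name
    detach=True,
    log_config=log_config,
)
print(f"Started {container.name}; its logs now flow through Fluentd")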

July 12, 2016


JFrog Artifactory Users Gain Real-time Continuous Intelligence with New Partnership

May 26, 2016


Correlating Logs and Metrics

This week, CEO Ramin Sayar offered insights into Sumo Logic's Unified Logs and Metrics announcement, noting that Sumo Logic is now the first and foremost cloud-native machine data analytics SaaS to handle log data and time-series metrics together. Beginning this week, Sumo Logic is providing early access to customers that are using either Amazon CloudWatch or Graphite to gather metrics. That's good news for practitioners from developers to DevOps and release managers because, as Ben Newton explains in his blog post, you'll now be able to view both logs and metrics data together and in context. For example, when troubleshooting an application issue, developers can start with log data to narrow a problem to a specific instance, then overlay metrics to build screens that show both logs and metrics (like CPU utilization over time) in the context of the problem.

What Are You Measuring?

Sumo Logic already provides log analytics at three levels:

- System (or machine)
- Network
- Application

Unified Logs & Metrics extends the reporting of time-series data to these same three levels. So using Sumo Logic you'll now be able to focus on application performance metrics, infrastructure metrics, custom metrics and log events.

Custom Application Metrics

Of the three, application metrics can be the most challenging, because as your application changes, so do the metrics you need to see. Often you don't know what you will be measuring until you encounter the problem. APM tools provide byte-code instrumentation, where they load code into the JVM. That can be helpful, but results are restricted to what the APM tool is designed or configured to report on. Moreover, the cost of instrumenting code using APM tools can be expensive. So developers, who know their code better than any tool, often resort to creating their own custom metrics to get the information needed to track and troubleshoot specific application behavior.

That was the motivation behind an open-source tool called StatsD. StatsD allows you to create new metrics in Graphite just by sending it data for that metric. That means there's no management overhead for engineers to start tracking something new: simply give StatsD a data point you want to track, and Graphite will create the metric.

Graphite itself has become a foundational monitoring tool, and because many of our customers already use it, Sumo Logic felt it was important to support it. Graphite, which is written in Python and open-sourced under the Apache 2.0 license, collects, stores and displays time-series data in real time. Graphite is fairly complex, but the short story is that it's good at graphing a lot of different things, like dozens of performance metrics from thousands of servers. Typically you write an application that collects numeric time-series data and sends it to Graphite's processing backend (Carbon), which stores the data in a Graphite database. The Carbon process listens for incoming data but does not send any response back to the client. Client applications typically publish metrics using plaintext, but can also use the pickle protocol or the Advanced Message Queuing Protocol (AMQP). The data can then be visualized through a web interface like Grafana.

But as previously mentioned, your custom application can simply send data points to a StatsD server. Under the hood, StatsD is a simple NodeJS daemon that listens for messages on a UDP port, then parses the messages, extracts the metrics data, and periodically (every 10 seconds) flushes the data to Graphite.
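As an illustration of how little ceremony this takes, here is a minimal sketch that emits a counter and a timer straight to a StatsD daemon over UDP (8125 is StatsD's default port; the metric names are made up):

import socket

def send_statsd(metric, value, metric_type, host="127.0.0.1", port=8125):
    # StatsD's plaintext format is "name:value|type", where "c" is a counter,
    # "ms" a timer and "g" a gauge, sent as a single UDP datagram.
    payload = f"{metric}:{value}|{metric_type}".encode()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))

# Count a checkout and record how long it took; Graphite creates these metrics
# the first time StatsD flushes them.
send_statsd("myapp.checkout.completed", 1, "c")
send_statsd("myapp.checkout.latency", 320, "ms")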
Sumo Logic's Unified Logs and Metrics

Getting metrics into Sumo Logic is super easy. With StatsD and Graphite you have two options: you can point your StatsD server to a Sumo Logic hosted collector, or you can install a native collector within the application environment.

CloudWatch

CloudWatch is Amazon's service for monitoring applications running on AWS and their system resources. CloudWatch tracks metrics (data expressed over a period of time) and monitors log files for EC2 instances and other AWS resources like EBS volumes, ELB, DynamoDB tables, and so on. For EC2 instances, you can collect metrics on things like CPU utilization, then apply dimensions to filter by instance ID, instance type, or image ID. Pricing for AWS CloudWatch is based on data points: a data point (DP) covers 5 minutes of activity, while a detailed data point (DDP) covers 1 minute.

Unified Logs and Metrics dashboards allow you to view metrics by category, grouped first by namespace and then by the various dimension combinations within each namespace. One very cool feature is that you can search for meta tags across EC2 instances. Sumo Logic makes the call once to retrieve meta tags and caches them. That means you no longer have to make an API call to retrieve each meta tag, which can result in cost savings since AWS charges per API call.

Use Cases

Monitoring – focus on tracking KPI behavior over time with dashboards and alerts. Monitoring allows you to:

- Track SLA adherence
- Watch for anomalies
- Respond quickly to emerging issues
- Compare to past behavior

Troubleshooting – this is about determining whether there is an outage and then restoring service. With Unified Logs and Metrics you can:

- Identify what is failing
- Identify when it changed
- Quickly iterate on ideas
- "Swarm" issues

Root-cause analysis – focuses on determining why something happened and how to prevent it. Dashboards overlaid with log data and metrics allow you to:

- Perform historical analysis
- Correlate behavior
- Uncover long-term fixes
- Improve monitoring

Correlating Logs and Metrics

When you start troubleshooting, you really want to start correlating multiple types of metrics and multiple sources of log data. Ultimately, you'll be able to start with Outliers and begin overlaying metrics and log data to quickly build views that help you identify issues. Now you'll be able to overlay logs and metrics from two different systems and do it in real time. If you want to see what Unified Logs and Metrics can do, Product Manager Ben Newton walks you through the steps of building searches on logs and overlaying metrics in this short introduction.


Containerization: Enabling DevOps Teams


Sumo Logic’s Christian Beedgen Speaks on Docker Logging and Monitoring

Support for Docker logging has evolved over the past two years, and the improvements made from Docker 1.6 to today have greatly simplified both the process and the options for logging. However, DevOps teams are still challenged with monitoring, tracking and troubleshooting issues in a context where each container emits its own logging data. Machine data can come from numerous sources, and containers may not agree on a common method. Once log data has been acquired, assembling meaningful real-time metrics such as the condition of your host environment, the number of running containers, CPU usage, memory consumption and network performance can be arduous. And if a logging method fails, even temporarily, that data is lost.

Sumo Logic's co-founder and CTO, Christian Beedgen, presented his vision for comprehensive container monitoring and logging to the 250+ developers who attended the Docker team's first Meetup at Docker HQ in San Francisco this past Tuesday.

Docker Logging

When it comes to logging in Docker, the recommended pathway for developers has been for the container to write to its standard output and let Docker collect the output. You then configure Docker to either store it in files or send it to syslog. Another option is to write to a directory, so the plain log file is the typical /var/log thing, and then share that directory with another container. In practice, when you start the first container, you indicate that /var/log will be a "volume," essentially a special directory that can then be shared with another container. Then you can run tail -f in a separate container to inspect those logs. Running tail by itself isn't extremely exciting, but it becomes much more meaningful if you want to run a log collector that takes those logs and ships them somewhere. The reason is that you shouldn't have to synchronize between application and logging containers (for example, where the logging system needs Java or Node.js because it ships logs that way). The application and logging containers should not have to agree on specific dependencies and risk breaking each other's code.

But as Christian showed, this isn't the only way to log in Docker. Christian began the presentation by reminding developers of the 12-Factor app, a methodology for building SaaS applications, which recommends limiting yourself to one process per container as a best practice, with each process running unbuffered and sending data to stdout. He then introduced the numerous options for container logging from the pre-Docker 1.6 days forward, and quickly enumerated them, noting that some were better than others. You could:

- Log Directly from an Application
- Install a File Collector in the Container
- Install a File Collector as a Container
- Install a Syslog Collector as a Container
- Use Host Syslog for Local Syslog
- Use a Syslog Container for Local Syslog
- Log to Stdout and Use a File Collector
- Log to Stdout and Use Logspout
- Collect from the Docker File Systems (Not Recommended)
- Inject a Collector via Docker Exec
- Use Logging Drivers in the Docker Engine

Christian also talked about logging drivers, which he believes have been a very large step forward in the last 12 months. He stepped through the incremental logging enhancements made to Docker from 1.6 to today. Docker 1.6 added three new log drivers: docker logs, syslog, and log-driver null. The driver interface was meant to support the smallest subset available for logging drivers to implement their functionality. Stdout and stderr would still be the source of logging for containers, but Docker takes the raw streams from the containers to create discrete messages delimited by writes that are then sent to the logging drivers. Version 1.7 added the ability to pass parameters to drivers, and in Docker 1.9 tags were made available to other drivers. Importantly, Docker 1.10 allows syslog to run encrypted, thus allowing companies like Sumo Logic to send securely to the cloud. He also noted recent proposals for a Google Cloud Logging driver and a TCP, UDP, Unix domain socket driver.

"As part of the Docker engine, you need to go through the engine commit protocol. This is good, because there's a lot of review stability. But it is also suboptimal because it is not really modular, and it adds more and more dependencies on third-party libraries." So he poses the question of whether this should be decoupled. In fact, others have suggested the drivers be external plugins, similar to how volumes and networks work. Plugins would allow developers to write custom drivers for their specific infrastructure, and would enable third-party developers to build drivers without having to get them merged upstream and wait for the next Docker release.

A Comprehensive Approach for Monitoring and Logging

As Christian stated, "you can't live on logs alone." To get real value from machine-generated data, you need to look at what he calls comprehensive monitoring. There are five requirements to enable comprehensive monitoring:

- Events
- Configurations
- Logs
- Stats
- Host and daemon logs

For events, you can send each event as a JSON message, which means you can use JSON as a way of logging each event. You enumerate all running containers, then start listening to the event stream. Then you start collecting each running container and each start event. For configurations, you call the inspect API and send that as JSON, as well. "Now you have a record," he said. "Now we have all the configurations in the logs, and we can quickly search for them when we troubleshoot." For logs, you simply call the logs API to open a stream and send each log as, well, a log. Similarly for statistics, you call the stats API to open a stream for each running container and each start event, and send each received JSON message as a log. "Now we have monitoring," says Christian. "For host and daemon logs, you can include a collector in host images or run a collector as a container. This is what Sumo Logic is already doing, thanks to the API."
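To make that pattern concrete, here is a minimal sketch of the enumerate/inspect/events/stats flow using the Docker SDK for Python (the library choice is an assumption for illustration; the talk did not prescribe a client). A real collector would forward each JSON line to an analytics service rather than printing it:

import json
import docker

client = docker.from_env()

# Configurations: call the inspect API for every running container and log it as JSON.
containers = client.containers.list()
for container in containers:
    print(json.dumps({"type": "inspect", "data": container.attrs}, default=str))

# Stats: open the stats stream for a container and log each JSON message it emits.
if containers:
    stats_stream = containers[0].stats(decode=True, stream=True)
    print(json.dumps({"type": "stats", "data": next(stats_stream)}, default=str))

# Events: listen to the Docker event stream and log each event as JSON.
for event in client.events(decode=True):
    print(json.dumps({"type": "event", "data": event}, default=str))
    break  # a real collector would keep streaming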
Summary

Perhaps it is a testament to the popularity of Docker, but even the Docker team seemed surprised by the huge turnout for this first meetup at HQ. As a proud sponsor of the meetup, Sumo Logic looks forward to new features in Docker 1.10 aimed at enhancing container security, including temporary file systems, seccomp profiles, user namespaces, and content-addressable images. If you're interested in learning more about Docker logging and monitoring, you can download Christian's Docker presentation on Slideshare.

Blog

Introducing Sumo Logic Live Tail

In my last post I wrote about how DevOps' emphasis on frequent release cycles leads to the need for more troubleshooting in production, and that developers are frequently being drawn into that process. Troubleshooting applications in production isn't always easy: for developers, the first course of action is to drop down to a terminal, ssh into the environment (assuming you have access) and begin tailing log files to determine the current state. When the problem isn't immediately obvious, they might tail -f the logs to a file, then grep for specific patterns. But there's no easy way to search log tails in real time. Until now.

Now developers and team members have a new tool, called Sumo Logic Live Tail, that lets you tail log files into a window, filter for specific conditions and use other features to troubleshoot in real time. Specifically, Live Tail lets you:
Pause the log stream, scroll up to previous messages, then jump to the latest log line and resume the stream.
Create keywords that are then used to highlight occurrences within the log stream.
Filter log files on the fly, in real time.
Tail multiple log files simultaneously by multi-tailing.
Launch Sumo Logic Search in the context of Sumo Logic Live Tail (and vice versa).
Live Tail is immediately available from within the Sumo Logic environment, and coming soon is a command line interface (CLI) that will allow developers to launch Live Tail directly from the command line.

What Can I Do With Live Tail?

Troubleshoot Production Logs in Real Time
You can now troubleshoot without having to log into business-critical applications. Users can also harness the power of Sumo Logic by launching Search in the context of Live Tail and vice versa. There is simply no need to move between different tools to get the data you need.

Save Time Requesting and Exporting Log Files
As I mentioned, troubleshooting applications in production with tail -f isn't always easy. First, you need to gain access to production log files, and for systems that handle sensitive data, admins may be reluctant to grant that access. Live Tail allows you to view your most recent logs in real time, analyze them in context, copy and share them via secure email whenever there's an outage, and set up searches based on Live Tail results using Sumo Logic.

Consolidate Tools to Reduce Costs
In the past, you may have toggled between two tools: one for tailing your logs and another for advanced analytics such as pattern recognition to help with troubleshooting, proactive problem identification and user analysis. With Sumo Logic Live Tail, you can now troubleshoot from the Sumo Logic browser interface or from the Sumo Logic Command Line Interface without investing in a separate solution for live tail, thereby reducing the cost of owning licenses for multiple tools.

Getting Started

There are a couple of ways to initiate a Live Tail session from the Sumo Logic web app:
Go directly to Live Tail by hovering over the Search menu and clicking on the Live Tail menu item; or
From an existing search, click the Live Tail link (just below the search interface).
In both instances, you'll need to enter the name of the _sourceCategory, _sourceHost, _sourceName, _source, or _collector of the log you want to tail, along with any filters. Click Run to initiate the search query. That will bring up a session similar to Figure 1.
Figure 1. A Live Tail session.
To find specific information, such as errors and exceptions, you can filter by keyword.
Just add your keywords to the Live Tail query and click Run or press Enter. The search will be rerun with the new filter, and those keywords will be highlighted on incoming messages, making it easy to spot those conditions. The screen clears, and new results automatically scroll.
Figure 2. Using keyword highlighting to quickly locate items in the log stream.
To highlight keywords that appear in your running Live Tail, click the A button. A dialog will open; enter the term you'd like to highlight. You may enter multi-term keywords separated by spaces, and hit Enter to add additional keywords. The different keywords are highlighted in different colors, so they are easy to find on the screen. You can highlight up to eight keywords at a time.

Multi-tailing

A single log file doesn't always give you the full picture. Using the multi-tail feature, you can tail multiple logs simultaneously. For example, after a database reboot, you can check whether it was successful by validating that the application is querying the database; but if there's an error on one server, you'll need to check the other servers to see if they may be affected. You can start a second Live Tail session from the Live Tail page or from the Search page, and the browser opens in split-screen mode, streaming 300 to 400 messages per minute. You can also open, or "pop out," a running Live Tail session into a new browser window. This way, you can move the new window to another screen, or watch it separately from the browser window where Sumo Logic is running.
Figure 3. Multi-tailing in split-screen mode.

Launch In Context

One of the highlights of Sumo Logic Live Tail is the ability to launch in context, which allows you to seamlessly alternate between Sumo Logic Search and Live Tail in the browser. For example, when you are on the Search page and need to start tailing a log file to view the most recent raw log lines coming in, you click a button to launch the Live Tail page from Search, and the source name gets carried forward automatically. If you want to perform more advanced operations like parsing, using operators, or extending the time range to the previous day, simply click "Open in Search". This launches a new Search tab that automatically includes the parameters you entered on the Live Tail page, so there's no time lost re-entering them. For more information about using Live Tail, check out the documentation in Sumo Logic Help.
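For contrast, here is roughly what the manual workflow that Live Tail replaces looks like as a small Python script: follow one local log file and flag lines containing a keyword. The path and keyword are made-up examples, and this is not part of Sumo Logic or its CLI.

import time

LOG_PATH = "/var/log/app/app.log"  # hypothetical log file
KEYWORD = "ERROR"                  # hypothetical keyword to highlight

def follow(path):
    # A bare-bones tail -f: seek to the end of the file and yield new lines as they arrive.
    with open(path, "r") as handle:
        handle.seek(0, 2)  # jump to the end of the file
        while True:
            line = handle.readline()
            if not line:
                time.sleep(0.5)
                continue
            yield line.rstrip("\n")

for line in follow(LOG_PATH):
    # Crude keyword "highlighting": prefix matching lines so they stand out.
    print(f">>> {line}" if KEYWORD in line else line)

This works for one file on one box; the pain described above starts when the logs live on many servers you may not even be able to ssh into, which is the gap Live Tail and Search fill.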

January 21, 2016

Blog

Open Source Projects at Sumo Logic

Someone recently asked me, rather smugly I might add, "who's ever made money from open source?" At the time I naively answered with the first person who came to mind, which was Rod Johnson, the creator of Java's Spring Framework. My mind quickly began retrieving other examples, but in the process I began to wonder about the motivation behind the question. The implication, of course, was that open source is free. Such a sentiment speaks not only to monetization but to the premise of open source, which raises a good many questions. As Karim R. Lakhani and Robert G. Wolf wrote, "Many are puzzled by what appears to be irrational and altruistic behavior… giving code away, revealing proprietary information, and helping strangers solve their technical problems." While many thought that better jobs, career advancement, and so on are the main drivers, Lakhani and Wolf discovered that how creative a person feels when working on a project (what they call "enjoyment-based intrinsic motivation") is the strongest and most pervasive driver. They also found that user need, intellectual stimulation derived from writing code, and improving programming skills are top motivators for project participation.

Open Source Projects at Sumo Logic

Here at Sumo Logic, we have some very talented developers on the engineering team, and they are passionate about both the Sumo Logic application and giving back. To showcase some of the open-source projects our developers are working on, as well as other commits from our community, we've created a gallery on our developer site where you can quickly browse projects and dive into the repos, code, and gists we've committed. Here's a sampling of what you'll find:

Sumoshell
Parsing out fields on the command line can be cumbersome, aggregating is basically impossible, and there is no good way to view the results; grep can't even tell that some log lines span multiple individual lines. Written by Russell Cohen, Sumoshell is a collection of CLI utilities written in Go that you can use to improve analyzing log files. Each individual command acts as a phase in a pipeline to get the answer you want, bringing a lot of the functionality of Sumo Logic to the command line.

Sumobot
As our Chief Architect, Stefan Zier, explains in this blog post, all changes to production environments at Sumo Logic follow a well-documented change management process. In the past, we manually tied together JIRA and Slack to get from a proposal to an approved change in the most expedient manner. So we built a plugin for our sumobot Slack bot. Check out both the post and the plugin.

Sumo Logic Python SDK
Written by Yoway Buorn, the SDK provides a Python interface to the Sumo Logic REST API. The idea is to make it easier to hit the API from Python code. Feel free to add your scripts and programs to the scripts folder.
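As a taste of what the Python SDK is for, here is a rough sketch of running a search through it. The client class, method names, and the "DONE GATHERING RESULTS" state string are written from memory of the SDK's README and the Search Job API, so treat them as assumptions and check the repository before relying on this.

# Sketch only: assumes the Sumo Logic Python SDK is installed and that it exposes
# a SumoLogic client with search_job / search_job_status / search_job_messages
# methods, as its README describes. Verify names and signatures against the repo.
import time
from sumologic import SumoLogic

sumo = SumoLogic("<accessId>", "<accessKey>")  # credentials from your Sumo Logic account

# Start a search job over the last 15 minutes (epoch milliseconds).
now_ms = int(time.time() * 1000)
job = sumo.search_job('_sourceCategory=prod/app error', now_ms - 15 * 60 * 1000, now_ms)

# Poll until the job finishes, then print a few of the matching messages.
while sumo.search_job_status(job)["state"] != "DONE GATHERING RESULTS":
    time.sleep(2)
for message in sumo.search_job_messages(job, limit=10)["messages"]:
    print(message["map"]["_raw"])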
Sumo Logic Java Client
Sumo Logic provides a cloud-based log management solution that can process and analyze log files at petabyte scale. This library provides a Java client to execute searches on the data collected by the Sumo Logic service.

Growing Number of Projects

Machine data and analytics is about more than just server logging and aggregation, and there are some interesting problems yet to be solved. Currently, you'll find numerous appenders for .NET and Log4j, search utilities for Ruby and Java, Chef cookbooks, and more. We could use additional examples calling our REST APIs from different languages, and as we build our developer community, we'd like to invite you to contribute. Check out the open-source projects landing page and browse through the projects. Feel free to fork a project and share, or add examples to folders where indicated.

Blog

DevOps Visibility - Monitor, Track, Troubleshoot

As organizations embrace the DevOps approach to application development, they face new challenges that can't be met with legacy monitoring tools. Teams need DevOps visibility. While continuous integration, automated testing and continuous delivery have greatly improved the quality of software, clean code doesn't mean software always behaves as expected. A faulty algorithm or a failure to account for unforeseen conditions can cause software to behave unpredictably. Within the continuous delivery (CD) pipeline, troubleshooting can be difficult, and in cases like debugging in a production environment it may not even be possible. DevOps teams are challenged with monitoring, tracking and troubleshooting issues in a context where applications, systems, networks, and tools across the toolchain all emit their own logging data. In fact, we are generating an ever-increasing variety, velocity, and volume of data.

Challenges of Frequent Release Cycles

The mantra of DevOps is to "release faster and automate more." But these goals can also become pain points: frequent releases introduce new complexity, and automation obscures that complexity. In fact, DevOps teams cite deployment complexity as their #1 challenge. The current challenges for DevOps teams are:
Difficulty collaborating across silos.
Difficulty syncing multiple development work-streams.
Frequent performance or availability issues.
No predictive analytics to project future KPI violations.
No proactive push notifications to alert on service outages.
DevOps drives cross-organizational team collaboration. However, organizations in the midst of a DevOps adoption are finding that they have difficulty collaborating across silos, and frequent release cycles add pressure when it comes to syncing multiple development work-streams. These forces are driving the need for more integration between existing legacy tools, and for new tools that cross-organizational teams can use collaboratively. Because of its emphasis on automated testing, DevOps has also created a need for toolsets that enable troubleshooting and root-cause analysis. Why? Because, as I've said, clean code doesn't mean software always behaves as expected. That's why one of the greatest pain points for many of these teams is additions and modifications to packaged applications, which are often deployed to multi-tenant cloud environments.

Troubleshooting from the Command Line

DevOps teams are discovering that performance and availability problems have increased with more frequent releases. That means Ops is spending more time troubleshooting, and development is being drawn into production troubleshooting. In response, developers will typically ssh into a server or cloud environment, drop down to the command line, and tail -f the log file. When the problem isn't readily seen, they begin grepping the logs using regular expressions, hunting for patterns and clues to the problem. But grep doesn't scale. Simply put, log data is everywhere. Application, system and network logs are stored in different locations on each server, and may be distributed across locations in the cloud or on other servers. Sifting through terabytes of data can take days. The difficulty is that there is no consistency, no centralization and no visibility:
No Consistency
Ops is spending more time troubleshooting.
Development is drawn into production troubleshooting.
Service levels have degraded with more frequent releases.
Performance and availability problems have increased.
No Centralization
Many locations of various logs on each server.
Logs are distributed across locations in the cloud or on various servers.
SSH + grep doesn't scale.
No DevOps Visibility
High-value data is buried in petabytes.
Meaningful views are difficult to assemble.
No real-time visibility.
Immense volume of log data.

DevOps Visibility Across the Toolchain

Sumo Logic provides a single solution that is tool-agnostic and provides visibility throughout the continuous integration and continuous delivery pipeline, as well as across the entire DevOps toolchain. Sumo Logic delivers a comprehensive strategy for monitoring, tracking and troubleshooting applications at every stage of the build, test, deliver, and deploy release cycle.
Full Stack DevOps Visibility: Gather event streams from applications at every stage, from sandbox development to final deployment and beyond. Combine them with system and infrastructure data to get a complete view of your application and infrastructure stack in real time.
No Integration Hassles: Sumo Logic can be integrated with a host of DevOps tools across the entire continuous delivery pipeline, not just server data.
Increased Availability and Performance: Because you can monitor deployments in real time, issues can be identified before they impact the application and customers. Precise, proactive analytics quickly uncover hidden root causes across all layers of the application and infrastructure stack.
Streamlines Continuous Delivery:
Troubleshoot issues and set alerts on abnormal container or application behavior.
Visualizations of key metrics and KPIs, including image usage, container actions and faults, as well as CPU, memory and network statistics.
Ability to easily create custom and aggregate KPIs and metrics using Sumo Logic's powerful query language.
Advanced analytics powered by LogReduce, Anomaly Detection, Transaction Analytics, and Outlier Detection.

Versatility

The one reaction I hear most from customers is surprise. An organization will typically apply Sumo Logic to a specific use case, such as security compliance. Then they discover the breadth of the product and apply it to use cases they had never thought of.
"Many benefits and features of Sumo Logic came to us as a surprise. The Sumo Logic Service continues to uncover different critical issues and deliver new insight throughout the development/support lifecycles of each new version we release." -- Nathan Smith, Technical Director, Outsmart Games
Sumo Logic enables DevOps teams to get deep, real-time visibility into their entire toolchain and production environment to help create better software faster. You can check out Sumo Logic right now with a free trial. It's easy to set up and lets you explore the wealth of features, including LogReduce, our pattern-matching algorithm that quickly detects anomalies, errors and trending patterns in your data.

Blog

New Heroku Add-on for Sumo Logic Goes Beta

Today, Sumo Logic is pleased to announce that it is partnering with Heroku to bring a new level of real-time visibility to Heroku logs. Now Heroku developers will be able to select the Sumo Logic add-on directly from the Heroku marketplace and quickly connect application, system and event logs to the Sumo Logic service with just a few clicks. Developers can then launch the Sumo Logic service directly from their Heroku Dashboard to gain real-time access to event logs in order to monitor new deployments, troubleshoot applications, and uncover performance issues. They can take advantage of Sumo Logic's powerful search language to quickly search unstructured log data and isolate the application node, module or library where the root cause of a problem hides. Developers can also use patent-pending LogReduce™ to reduce hundreds of thousands of log events down to groups of patterns while filtering out the noise in the data. LogReduce can help reduce the Mean Time to Identification (MTTI) of issues by 50% or more. Developers also have access to Outlier & Anomaly Detection. Often, analysis and troubleshooting are centered on known data in systems. However, most errors and security breaches stem from unknown data, or data that is new to a system. Analyzing this data requires highly scalable infrastructure and advanced algorithms, and that is what Sumo Logic enables with its Anomaly Detection feature.

Extending Heroku's Logplex

Heroku is a polyglot Platform-as-a-Service (PaaS) that allows developers to build applications locally, then push changes via Git up to Heroku for deployment. Heroku provides a managed container environment that supports popular development stacks including Java, Ruby, Scala, Clojure, Node.js, PHP, Python and Go. For logging, Heroku provides a service called Logplex that captures events and output streams from your app's running processes, system components and other relevant platform-level events, and routes them into a single channel. Logplex aggregates log output from your application (including logs generated by an application server and libraries), system logs (such as restarting a crashed process), and API logs (e.g., deploying new code). The caveat is that Heroku only stores the last 1,500 lines of consolidated logs. To get Sumo Logic's comprehensive logging with advanced search, pattern matching, outlier detection, and anomaly detection, you previously had to create a Heroku log drain, a network service that can consume your app's logs, and then configure an HTTPS service for sending those logs to Sumo Logic.

Seamless UX

The Heroku add-on simplifies this process while providing developers with a seamless experience from the Heroku Dashboard. Now, with the Heroku add-on for Sumo Logic, you simply push your changes up to Heroku, then run the following on the command line to create your app:
heroku addons:create sumologic --app <my_app_name>
This creates the application on Heroku, configures the app, and points the log drain to the Sumo Logic service for you automatically. To view your logs, go to your Heroku Dashboard and click on the Sumo Logic add-on; that will open Sumo Logic.

Heroku Add-on for Sumo Logic Quick Start

I've created a quick start that shows you how to build a simple Ruby app on Heroku, install the Sumo Logic add-on, and connect your new app to the Sumo Logic service. You can use Sumo Free to test your configuration, and you can run through the entire quick start in 15 minutes.
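Since Logplex only captures what an app writes to its output streams, the main thing application code has to do is log to stdout rather than to a local file. Here is a minimal, hypothetical Python example of that (the quick start itself uses a Ruby app); the logger name and messages are made up.

import logging
import sys

# Heroku's Logplex collects whatever the app writes to stdout/stderr,
# so configure logging to write there instead of to a local file.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)

log = logging.getLogger("web")
log.info("app booted, listening on port %s", 5000)
log.warning("payment gateway slow, retrying")

Once the add-on has pointed the log drain at Sumo Logic, lines like these show up there alongside Heroku's system and API logs.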
About the Author
Michael is the Head of Developer Programs at Sumo Logic. You can follow him on Twitter @CodeJournalist or LinkedIn.

October 15, 2015

Blog

Heroku Add-on for Sumo Logic Quick Start

October 15, 2015

Blog

New DevOps Community Enables Continuous Delivery Practitioners

Blog

Deploying “Hello, World!” DevOps Style

Blog

Change Management in a Change-Dominated World

DevOps isn't just about change -- it's about continuous, automated change. It's about ongoing stakeholder input and shifting requirements; about rapid response and fluid priorities. In such a change-dominated world, how can the concept of change management mean anything? But maybe that's the wrong question. Maybe a better question would be this: can a change-dominated world even exist without some kind of built-in change management?
Change management is always an attempt to impose orderly processes on disorder. That, at least, doesn't change. What does change is the nature and the scope of the disorder, and the nature and the scope of the processes that must be imposed on it. This is what makes the DevOps world look so different, and appear to be so alien to any kind of recognizable change management.
Traditional change management, after all, seems inseparable from waterfall and other traditional development methodologies. You determine which changes will be part of a project, you schedule them, and there they are on a Gantt chart, each one following its predecessor in proper order. Your job is as much to keep out ad-hoc chaos as it is to manage the changes in the project.
And in many ways, Agile change management is a more fluid and responsive version of traditional change management, scaled down from project level to iteration level, with a shifting stack of priorities replacing the Gantt chart. Change management's role is to determine if and when there is a reason why a task should move higher or lower in the priority stack, but not to freeze priorities (as would have happened in the initial stages of a waterfall project). Agile change management is priority management as much as it is change management -- but it still serves as a barrier against the disorder of ad-hoc decision-making. In Agile, the actual processes involved in managing changes and priorities are still in human hands and are based on human decisions.
DevOps moves many of those management processes out of human hands and places them under automated control. Is it still possible to manage changes, or even maintain control over priorities, in an environment where much of the on-the-ground decision-making is automated? Consider what automation actually is in DevOps -- it's the transfer of human management policies, decision-making, and functional processes to an automatically operating computer-based system. You move the responsibilities that can be implemented in an algorithm over to the automated system, leaving the DevOps team free to deal with the items that need actual, hands-on human attention.
This immediately suggests what naturally tends to happen with change management in DevOps. It splits into two forks, each of which is important to the overall DevOps effort. One fork consists of change management as implemented in the automated continuous release system, while the other fork consists of human-directed change management of the somewhat more traditional kind. Each of these requires first-rate change management expertise on an ongoing basis.
It isn't hard to see why an automated continuous release system that incorporates change management features would require the involvement of human change management experts during its initial design and implementation phases. Since the release system is supposed to incorporate human expertise, it naturally needs expert input at some point during its design.
Input from experienced change managers (particularly those with a good understanding of the system being developed) can be extremely important during the early design phases of an automated continuous release system; you are in effect building their knowledge into the structure of the system. But DevOps continuous release is by its very nature likely to be a continually changing process itself, which means that the automation software that directs it is going to be in a continual state of change. This continual flux includes the expertise that is embodied in the system, which means that its frequent revision and redesign will require input from human change management experts.
And not all management duties can be automated. After human managers have been relieved of all the responsibilities that can be automated, they are left with the ones that for one reason or another do not lend themselves well to automation -- in essence, anything that can't easily be turned into an algorithm. This is likely to include at least some (and possibly many) of the kinds of decisions that fall under the heading of change management. These unautomated responsibilities will require someone (or several people) to take on the role of change manager.
And DevOps change management generally does not take its cue from waterfall in the first place. It is more likely to be a lineal descendant of Agile change management, with its emphasis on managing a flexible stack of priorities during the course of an iteration rather than a static list of requirements that must be included in the project. This kind of priority-balancing requires more human involvement than waterfall's static list does, which means that Agile-style change management is likely to result in a greater degree of unautomated change management than one would find with waterfall.
This shouldn't be surprising. As the more repetitive, time-consuming, and generally uninteresting tasks in any system are automated, more time is left for complex and demanding tasks involving analysis and decision-making. This in turn makes it easier to implement methodologies that might not be practical in a less automated environment. In other words, human-based change management will now focus on managing shifting priorities and stakeholder demands, not because it has to, but because it can.
So what place does change management have in a change-dominated world? It transforms itself from a relatively static discipline imposed on an inherently slow process (waterfall development) into an intrinsic (and dynamic) part of the change-driven environment itself. DevOps change management manages change from within the machinery of the system itself, while at the same time allowing greater latitude for human guidance of the flow of change in response to the shifting requirements imposed by that change-driven environment. To manage change in a change-dominated world, one becomes the change.