Chris Tozzi

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure, and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO. His latest book, For Fun and Profit: A History of the Free and Open Source Software Revolution, was published in 2017.

Posts by Chris Tozzi

Blog

Top six Amazon S3 metrics to monitor

Blog

Distributed tracing vs. application monitoring

Blog

Explore NGINX usage, performance, and transactions to improve customer experience

Blog

Microservices vs. serverless architecture

Blog

Efficiently monitor the state of Redis database clusters

Blog

Observability vs. monitoring: what's the difference?

Blog

How to monitor Amazon Aurora RDS logs and metrics

Blog

How to Monitor Amazon Redshift

Blog

Monitoring Microsoft SQL Server: Best Practices

Blog

The 7 Essential Metrics for Amazon EC2 Monitoring

Blog

7 Key DevOps Principles

Blog

What is Blockchain, Anyway? And What Are the Biggest Use Cases?

Everyone’s talking about blockchain these days. In fact, there is so much hype about blockchains — and there are so many grand ideas related to them — that it’s hard not to wonder whether everyone who is excited about blockchains understands what a blockchain actually is. If, amidst all this blockchain hype, you’re asking yourself “what is blockchain, anyway?” then this article is for you. It defines what blockchain is and explains what it can and can’t do.

Blockchain Is a Database Architecture

In the most basic sense, blockchain is a particular database architecture. In other words, like any other type of database architecture (relational databases, NoSQL and the like), a blockchain is a way to structure and store digital information. (The caveat to note here is that some blockchains now make it possible to distribute compute resources in addition to data. For more on that, see below.)

What Makes Blockchain Special?

If blockchain is just another type of database, why are people so excited about it? The reason is that a blockchain has special features that other types of database architectures lack. They include:

Maximum data distribution. On a blockchain, data is distributed across hundreds or thousands of nodes. While other types of databases are sometimes deployed using clusters of multiple servers, this is not a strict requirement. A blockchain by definition involves a widely distributed network of nodes for hosting data.

Decentralization. Each of the nodes on a blockchain is controlled by a separate party. As a result, the blockchain database as a whole is decentralized. No single person or group controls it, and no single group or person can modify it. Instead, changes to the data require network consensus.

Immutability. In most cases, the protocols that define how you can read and write data to a blockchain make it impossible to erase or modify data once it has been written. As a result, data stored on a blockchain is immutable. You can add data, but you can’t change what already exists. (We should note that while data immutability is a feature of the major blockchains created to date, it’s not strictly the case that blockchain data is always immutable.)

Beyond Data

As blockchains have evolved over the past few years, some blockchain architectures have grown to include more than a way to distribute data across a decentralized network. They also make it possible to share compute resources. The Ethereum blockchain does this, for example, although Bitcoin — the first and best-known blockchain — was designed only for recording data, not sharing compute resources. If your blockchain provides access to compute resources as well as data, it becomes possible to execute code directly on the blockchain. In that case, the blockchain starts to look more like a decentralized computer than just a decentralized database.

Blockchains and Smart Contracts

Another term that comes up frequently when discussing what defines a blockchain is the smart contract. A smart contract is code that causes a specific action to happen automatically when a certain condition is met. The code is executed on the blockchain, and the results are recorded there. This may not sound very innovative, but there are some key benefits and use cases. Any application could incorporate code that makes a certain outcome conditional upon a certain circumstance; if-this-then-that code stanzas are not really a big deal on their own (the toy sketch below shows exactly that kind of ordinary conditional).
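For illustration only, here is what such an ordinary, centrally run conditional looks like in Python. The function, names, and amounts are all made up, and this is deliberately not a smart contract: it runs on one machine that a single party controls.

# A plain "if this, then that" conditional, run by a single party.
# Names and amounts are made up for illustration; this is NOT a smart
# contract -- whoever runs this code can simply change it.

def release_payment(delivery_confirmed, amount):
    """Release escrowed funds to the seller only once delivery is confirmed."""
    if delivery_confirmed:
        return "Released %.2f to seller" % amount
    return "Funds held in escrow"

print(release_payment(delivery_confirmed=False, amount=100.0))  # Funds held in escrow
print(release_payment(delivery_confirmed=True, amount=100.0))   # Released 100.00 to seller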
What makes a smart contract different from a typical software conditional statement, however, is that the smart contract is executed on a decentralized network of computers, so no one can modify its outcomes. This differentiates smart contracts from conditional statements in traditional applications, where the application is controlled by a single, central authority that has the power to modify it. Smart contracts are useful for governing things like payment transactions. If you want to ensure that a seller does not receive payment for an item until the buyer receives the item, you could write a smart contract to make that happen automatically, without relying on third-party oversight.

Limitations of Blockchains

By enabling complete data decentralization and smart contracts, blockchains make it possible to do a lot of interesting things that you could not do with traditional infrastructure. However, it’s important to note that blockchains are not magic. Most blockchains currently have several notable limitations:

Transactions are not instantaneous. Bitcoin transactions, for example, can take a surprisingly long time to complete.

Access control is complicated. On most blockchains, all data is publicly accessible. There are ways to limit access, but they are complex. In general, a blockchain is not a good solution if you require sophisticated access control for your data.

Security. While blockchain is considered a secure way to transact and to store or send sensitive data, there have been a few blockchain-related security breaches. Moving your data to a blockchain does provide an inherent layer of protection because of its decentralization and encryption features, but, like most things, it does not guarantee that the data won’t be hacked or exploited.

Additional Resources

Watch the latest SnapSecChat videos to hear what our CSO, George Gerchow, has to say about data privacy and the demand for security as a service.

Read a blog on new Sumo Logic research that reveals why a new approach to security in the cloud is required for today’s modern businesses.

Learn what three security dragons organizations must slay to achieve threat discovery and investigation in the cloud.

Blog

How Log Analysis Has Evolved

Blog

6 Metrics You Should Monitor During the Application Build Cycle

Monitoring application metrics and other telemetry from production environments is important for keeping your app stable and healthy. That you know. But app telemetry shouldn’t start and end with production. Monitoring telemetry during builds is also important for application quality. It helps you detect problems earlier, before they reach production, and it allows you to achieve continuous, comprehensive visibility into your app. Below, we’ll take a look at why monitoring app telemetry during builds is important, then discuss the specific types of data you should collect at build time.

App Telemetry During Builds

By monitoring application telemetry during the build stage of your continuous delivery pipeline, you can achieve the following:

Early detection of problems. Telemetry collected during builds can help you identify issues with your delivery chain early on. For example, if the number of compiler warnings is increasing, it could signal a problem with your coding process. You want to address that before your code gets into production.

Environment-specific visibility. Since you usually perform builds for specific types of deployment environments, app telemetry from the builds can help you gain insight into the way your app will perform within each type of environment. Here again, data from the builds helps you find potential problems before your code gets to production.

Code-specific statistics. App telemetry data from a production environment is very different from build telemetry, because the nature of what is being measured is different. Production telemetry focuses on metrics like bandwidth and active connections. Build telemetry gives you more visibility into the app itself — how many internal functions you have, how quickly your code compiles, and so on.

Continuous visibility. Because app telemetry from builds gives you visibility that other types of telemetry can’t provide, it’s an essential ingredient for achieving continuous visibility into your delivery chain. Combined with metrics from other stages of delivery, build telemetry allows you to understand your app in a comprehensive way, rather than only monitoring it in production.

Metrics to Collect

If you’ve read this far, you know the why of build telemetry. Now let’s talk about the how. Specifically, let’s take a look at which types of metrics to focus on when monitoring app telemetry during the build stage of your continuous delivery pipeline.

Number of environments you’re building for. This might seem so basic that it’s not worth monitoring. But in a complex continuous delivery workflow, it’s possible that the types of environments you target will change frequently. Tracking the total number of environments can help you understand the complexity of your build process. It can also help you measure your efforts to stay agile by maintaining the ability to add or remove target environments quickly.

Total lines of source code. This metric gives you a sense of how quickly your application is growing — and by extension, how many resources it will consume and how long builds should take. The correlation between lines of source code and these factors is rough, of course, but it’s still a useful metric to track.

Build times. Monitoring how long builds take, and how build times vary between different target environments, is another way to get a sense of how quickly your app is growing. It’s also important for keeping your continuous delivery pipeline flowing smoothly.
Code builds are often the most time-consuming process in a continuous delivery chain. If build times start increasing substantially, you should address them in order to avoid delays that could break your ability to deliver continuously.

Compiler warnings and errors. Compiler issues are often an early sign of software architecture or coding issues. Even if you are able to work through the errors and warnings that your compiler throws, monitoring their frequency gives you an early warning of problems with your app.

Build failure rate. This metric serves as another proxy for potential architecture or coding problems.

Code load time. Measuring changes in the time it takes to check out code from the repository where you store it helps you prevent obstacles that could hamper continuous delivery.

Monitoring telemetry during the build stage of your pipeline by focusing on the metrics outlined above helps you not only build more reliably, but also gain insights that make it easier to keep your overall continuous delivery chain operating smoothly. Most importantly, these metrics help keep your app stable and efficient by assisting you in detecting problems early and maximizing your understanding of your application. A minimal sketch of how a couple of them could be collected follows below.
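To make a couple of the metrics above concrete, here is a minimal Python sketch that counts total lines of source code and times a build command. The source directory, file suffix, and build command are placeholders for your own project, and in practice you would forward the numbers to your monitoring platform rather than print them.

import subprocess
import time
from pathlib import Path

SRC_DIR = Path("src")            # placeholder: root of your source tree
SRC_SUFFIX = ".py"               # placeholder: the file type you want to count
BUILD_CMD = ["make", "build"]    # placeholder: your real build command

def total_source_lines(root, suffix):
    """Count lines across all source files with the given suffix."""
    return sum(
        sum(1 for _ in path.open(errors="ignore"))
        for path in root.rglob("*" + suffix)
    )

def timed_build(cmd):
    """Run the build command and return (duration_in_seconds, exit_code)."""
    start = time.monotonic()
    result = subprocess.run(cmd)
    return time.monotonic() - start, result.returncode

if __name__ == "__main__":
    print("total_source_lines:", total_source_lines(SRC_DIR, SRC_SUFFIX))
    duration, exit_code = timed_build(BUILD_CMD)
    print("build_seconds:", round(duration, 1))
    print("build_failed:", exit_code != 0)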

Blog

Getting the Most Out of SaltStack Logs

SaltStack, also known simply as Salt, is a handy configuration management platform. Written in Python, it’s open source and allows ITOps teams to define “Infrastructure as Code” in order to provision and orchestrate servers. But SaltStack’s usefulness is not limited to configuration management. The platform also generates logs, and like all logs, that data can be a useful source of insight in all manner of ways. This article provides an overview of SaltStack logging, as well as a primer on how to analyze SaltStack logs with Sumo Logic.

Where does SaltStack store logs?

The first thing to understand is where SaltStack logs live. The answer to that question depends on where you choose to place them. You can set the log location by editing your SaltStack configuration file on the salt-master. By default, this file should be located at /etc/salt/master on most Unix-like systems. The variable you’ll want to edit is log_file. If you want to store logs locally on the salt-master, you can simply set this to any location on the local file system, such as /var/log/salt/salt_master.

Storing Salt logs with rsyslogd

If you want to centralize logging across a cluster, however, you will benefit from using rsyslogd, a system logging tool for Unix-like systems. With rsyslogd, you can configure SaltStack to store logs either remotely or on the local file system. For remote logging, set the log_file parameter in the salt-master configuration file according to the format <file|udp|tcp>://<host|socketpath>:<port>. For example, to connect to a server named mylogserver (whose name should be resolvable on your local network DNS, of course) via UDP on port 2099, you’d use a line like this one:

log_file: udp://mylogserver:2099

(A quick way to test that this setting is actually emitting data is sketched further below.)

Colorizing and bracketing your Salt logs

Another useful configuration option that SaltStack supports is custom colorization of console logs. This can make it easier to read the logs by separating high-priority events from less important ones. To set colorization, you change the log_fmt_console parameter in the Salt configuration file. The colorization options available are:

'%(colorlevel)s'   # log level name colorized by level
'%(colorname)s'    # colorized module name
'%(colorprocess)s' # colorized process number
'%(colormsg)s'     # log message colorized by level

Log files can’t be colorized (that would be less useful anyway, since the program you use to read the log file may not support color output), but they can be padded and bracketed to distinguish different event levels. The parameter to set here is log_fmt_logfile, and the options supported include:

'%(bracketlevel)s'   # equivalent to [%(levelname)-8s]
'%(bracketname)s'    # equivalent to [%(name)-17s]
'%(bracketprocess)s' # equivalent to [%(process)5s]

How to Analyze SaltStack logs with Sumo Logic

So far, we’ve covered some handy things to know about configuring SaltStack logs. You’re likely also interested in how you can analyze the data in those logs. Here, Sumo Logic, which offers easy integration with SaltStack, is an excellent solution. Sumo Logic has an official SaltStack formula, which is available from GitHub. To install it, you can use GitFS to make the formula available to your system, but the simpler approach (for my money, at least) is simply to clone the formula repository in order to save it locally. That way, changes to the formula won’t break your configuration. (The downside, of course, is that you also won’t automatically get updates to the formula, but you can always update your local clone of the repository if you want them.)
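If you want a quick way to confirm that the remote logging setting described earlier is actually sending data before you wire up rsyslogd or a collector, a throwaway UDP listener is enough. The Python sketch below is only a debugging aid and is not part of Salt or Sumo Logic; the port matches the mylogserver example above.

import socket

# Throwaway listener for verifying a setting like "log_file: udp://mylogserver:2099".
# Run it on the log host and watch salt-master messages arrive.
PORT = 2099  # matches the example above

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", PORT))
print("listening for salt-master log messages on udp/%d ..." % PORT)

while True:
    data, addr = sock.recvfrom(65535)
    print(addr[0], data.decode(errors="replace").rstrip())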
To set up the Sumo Logic formula, run these commands:

mkdir -p /srv/formulas   # or wherever you want to save the formula
cd /srv/formulas
git clone https://github.com/saltstack-formulas/sumo-logic-formula.git

Then simply edit your configuration by adding the new directory to the file_roots parameter, like so:

file_roots:
  base:
    - /srv/salt
    - /srv/formulas/sumo-logic-formula

Restart your salt-master and you’re all set. You’ll now be able to analyze your SaltStack logs from Sumo Logic, along with any other logs you work with through the platform.

Getting the Most Out of SaltStack Logs is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO.

Blog

Using Logs to Speed Your DevOps Workflow

Blog

5 Bintray Security Best Practices

Bintray, JFrog’s software hosting and distribution platform, offers lots of exciting features, like CI integration and REST APIs. If you’re like me, you enjoy thinking about those features much more than you enjoy thinking about software security. Packaging and distributing software is fun; worrying about the details of Bintray security configurations and access control for your software tends to be tedious (unless security is your thing, of course). Like any other tool, however, Bintray is only effective in a production environment when it is run securely. That means that, alongside all of the other fun things you can do with Bintray, you should plan and run your deployment in a way that mitigates the risk of unauthorized access, the exposure of private data, and so on. Below, I explain the basics of Bintray security, and outline strategies for making your Bintray deployment more secure.

Bintray Security Basics

Bintray is a cloud service hosted by JFrog’s data center provider. JFrog promises that the service is designed for security and hardened against attack. (The company is not very specific about how it mitigates security vulnerabilities for Bintray hosting, but I wouldn’t be either, since one does not want to give potential attackers information about the configuration.) JFrog also says that it restricts employee access to Bintray servers and uses SSH over VPN when employees do access the servers, which adds additional security.

The hosted nature of Bintray means that none of the security considerations associated with on-premises software apply. That makes life considerably easier from the get-go if you’re using Bintray and are worried about security. Still, there’s more that you can do to ensure that your Bintray deployment is as robust as possible against potential intrusions. In particular, consider adopting the following policies.

Set up an API key for Bintray

Bintray requires users to create a username and password when they first set up an account. You’ll need those when getting started with Bintray. Once your account is created, however, you can help mitigate the risk of unauthorized access by creating an API key. This allows you to authenticate over the Bintray API without sending your account password, so that even if a network sniffer is listening to your traffic, your primary credentials won’t be compromised.

Use OAuth for Bintray Authentication

Bintray also supports authentication using the OAuth protocol. That means you can log in using credentials from a GitHub, Twitter or Google+ account. Chances are that you pay closer attention to one of these accounts (and get notices from the providers about unauthorized access) than you do to your Bintray account. So, to maximize security and reduce the risk of unauthorized access, make sure your Bintray account itself has login credentials that cannot be brute-forced, then log in to Bintray via OAuth using an account from a third-party service that you monitor closely.

Sign Packages with GPG

Bintray supports optional GPG signing of packages. To do this, you first have to configure a key pair in your Bintray profile. For details, check out the Bintray documentation. GPG signing is another obvious way to help keep your Bintray deployment more secure. It also keeps the users of your software distributions happier, since they will know that your packages are GPG-signed and therefore less likely to contain malicious content.
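To make the API-key point above concrete, here is a rough sketch of what authenticating a Bintray REST call with a username and API key (rather than the account password) might look like in Python, using the requests library. The repository-listing endpoint shown is only illustrative; consult the Bintray REST API documentation for the exact resources you need.

import requests

BINTRAY_USER = "your-username"       # placeholder
BINTRAY_API_KEY = "your-api-key"     # placeholder: generated in your Bintray profile

# Basic auth with the API key in place of the account password, so the
# password itself never travels with API requests.
# The repository listing path below is illustrative only.
response = requests.get(
    "https://api.bintray.com/repos/" + BINTRAY_USER,
    auth=(BINTRAY_USER, BINTRAY_API_KEY),
)
response.raise_for_status()
for repo in response.json():
    print(repo.get("name"))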
Take Advantage of Bintray’s Access Control

The professional version of Bintray offers granular control over who can download packages. (Unfortunately, this feature is only available in that edition.) You can configure access on a per-user or per-organization basis. Security shouldn’t be the main reason you use granular access control (the feature is primarily designed to help you fine-tune your software distribution), but it doesn’t hurt to take advantage of it in order to reduce the risk that software becomes available to a user to whom you don’t want to give access.

5 Bintray Security Best Practices is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO.


Blog

Tutorial: How to Run Artifactory as a Container

Blog

Solaris Containers: What You Need to Know

Blog

A Beginner’s Guide to GitHub Events

Do you like GitHub, but don’t like having to log in to check on the status of your project or code? GitHub events are your solution. GitHub events provide a handy way to receive automated status updates from your GitHub repos concerning everything from code commits to new users joining a project. And because they are accessible via a Web API as GET requests, it’s easy to integrate them into the notification system of your choosing. Keep reading for a primer on GitHub events and how to get the most out of them.

What GitHub events are, and what they are not

Again, GitHub events provide an easy way to keep track of your GitHub repository without monitoring its status manually. They’re basically a notification system that offers a high level of customizability. You should keep in mind, however, that GitHub events are designed only as a way to receive notifications. They don’t allow you to interact with your GitHub repo. You can’t trigger events; you can only receive notifications when specific events occur. That means that events are not a way for you to automate the maintenance of your repository or project. You’ll need other tools for that. But if you just want to monitor changes, they’re a simple solution.

How to use GitHub events

GitHub event usage is pretty straightforward. You simply send GET requests to https://api.github.com, specifying the type of information you want by completing the URL path accordingly. For example, if you want information about the public events performed by a given GitHub user, you would send a GET request to this URL:

https://api.github.com/users/<username>/events

(If you are authenticated, this request will also generate information about private events that you have performed.) Here’s a real-world example, in which we send a GET request using curl to find information about public events performed by Linus Torvalds (the original author of Git), whose username is torvalds:

curl -i -H "Accept: application/json" -H "Content-Type: application/json" -X GET https://api.github.com/users/torvalds/events

Another handy request lets you list events for a particular organization. The URL to use here looks like:

https://api.github.com/users/:username/events/orgs/:org

The full list of events, with their associated URLs, is available from the GitHub documentation.

Use GitHub Webhooks for automated events reporting

So far, we’ve covered how to request information about an event using a specific HTTP request. But you can take things further by using GitHub Webhooks to automate reporting about events of a certain type. Webhooks allow you to “subscribe” to particular events and receive an HTTP POST response (or, in GitHub parlance, a “payload”) at a URL of your choosing whenever that event occurs. You can create a Webhook in the GitHub Web interface, where you specify the URL to which GitHub should send your payload when an event is triggered. Alternatively, you can create Webhooks via the GitHub API using POST requests. However you set them up, Webhooks allow you to monitor your repositories (or any public repositories) and receive alerts in an automated fashion. Like most good things in life, Webhooks are subject to certain limitations, which are worth noting: you can configure a maximum of twenty Webhooks for each event on a given GitHub organization or repository.

Authentication and GitHub events

The last bit of information we should go over is how to authenticate with the GitHub API.
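Before getting into the details of authentication, here is a rough Python equivalent of the curl request shown earlier, listing public events for the same illustrative user. It assumes the requests library is installed; the commented-out header shows where a token would go, which the next section explains.

import requests

USERNAME = "torvalds"  # same illustrative user as the curl example above
url = "https://api.github.com/users/%s/events" % USERNAME

headers = {"Accept": "application/json"}
# headers["Authorization"] = "token YOUR-OAUTH-TOKEN"  # needed for private events

response = requests.get(url, headers=headers)
response.raise_for_status()

# Each event records its type, the repository involved, and a timestamp.
for event in response.json():
    print(event["type"], event["repo"]["name"], event["created_at"])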
While you can monitor public events without authentication, you’ll need to authenticate in order to keep track of private ones. Authentication via the GitHub API is detailed in the GitHub documentation, but it basically boils down to three options. The simplest is to do HTTP basic authentication using a command like:

curl -u "username" https://api.github.com

If you want to be more sophisticated, you can also authenticate using OAuth2 via either key/secrets or tokens. For example, authenticating with a token would look something like:

curl https://api.github.com/?access_token=OAUTH-TOKEN

If you’re monitoring private events, you’ll want to authenticate with one of these methods before sending requests about the events.

Further reading

If you want to dive deeper into the details of GitHub events, the following resources from the GitHub documentation are useful:

Overview of event types
Event payloads according to event type
Setting up Webhooks
GitHub API authentication

A Beginner’s Guide to GitHub Events is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

Blog

Docker 1.12: What You Need to Know

Blog

Using Node.js npm with Artifactory via the API and CLI

Blog

Docker Security - 6 Ways to Secure Your Docker Containers