William Cohen


William Cohen has been a developer of performance tools at Red Hat for over a decade and has worked on a number of the performance tools in Red Hat Enterprise Linux and Fedora such as OProfile, PAPI, SystemTap, and Dyninst.

William Cohen's contributions

Many developers would like to run their existing applications in a container with restricted capabilities to improve security. However, it may not be clear which capabilities the application uses because the code relies on libraries or other code developed elsewhere. The developer could run the application in an unrestricted container that allows all syscalls and capabilities, which avoids hard-to-diagnose failures caused by the application's use of forbidden capabilities or syscalls. Of course, this eliminates the...

Maybe you have so much memory in your computer that you never have to worry about it. Then again, maybe you find that some C or C++ application is using more memory than expected. This could be preventing you from running as many containers on a single system as you expected, it could be causing performance bottlenecks, and it could even be forcing you to pay for more memory in your servers. You do some quick "back of the...

No one wants the hardware in their computer sitting idle - we all want to get as much useful work out of our hardware as possible. Mechanisms such as caches and branch prediction have been incorporated into processors to minimize the amount of processor idle time caused by memory accesses and changes in program flow; however, these mechanisms are not perfect. There are still times when the processor could be idle waiting for data or computational results to become available...

The classic 1984 movie Ghostbusters offered an important safety tip for all of us: "Don't cross the streams." - "Why not?" - "It would be bad." - "I’m fuzzy on the whole good/bad thing. What do you mean, 'bad'?" - "Try to imagine all life as you know it stopping instantaneously and every molecule in your body exploding at the speed of light." - "Right. That’s bad. Okay. All right. Important safety tip. Thanks..." Similarly, in computing...

In the traditional processor pipeline model, under ideal circumstances one new instruction enters the processor's pipeline and one instruction completes execution each cycle. Thus, in the best case the processor can have an average execution rate of one clock per instruction. A superscalar processor allows multiple unrelated instructions to start on the same clock cycle on separate hardware units or pipelines. Under ideal conditions a superscalar processor could have an average clocks per instruction (CPI) of less than one, meaning your 2GHz...
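As a rough illustration of the CPI arithmetic in the teaser above, the following Python sketch converts a clock frequency and an average CPI into instructions completed per second. The function name and the sample CPI of 0.5 are assumptions chosen for illustration; they are not measurements of any real processor and are not taken from the article.

```python
def instructions_per_second(clock_hz: float, cpi: float) -> float:
    """Average instructions completed per second for a given clock rate and CPI."""
    if cpi <= 0:
        raise ValueError("CPI must be positive")
    return clock_hz / cpi


CLOCK_HZ = 2e9  # 2GHz clock, the frequency mentioned in the text

# Ideal scalar pipeline: an average of one clock per instruction.
scalar_rate = instructions_per_second(CLOCK_HZ, 1.0)

# Hypothetical superscalar case: CPI of 0.5, i.e. two instructions per clock on average.
superscalar_rate = instructions_per_second(CLOCK_HZ, 0.5)

print(f"scalar:      {scalar_rate:.2e} instructions/s")
print(f"superscalar: {superscalar_rate:.2e} instructions/s")
```

A CPI below one only means that, on average, more than one instruction completes per clock; real workloads rarely sustain the ideal rate.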

A pipelined processor requires a steady stream of instructions to be fed into the pipeline. Any delay in feeding instructions into the pipeline will hurt performance. For a sequence of instructions without branches it is relatively easy to determine the next instruction to feed into the pipeline, particularly for processors with fixed-size instructions. Variable-sized instructions might complicate finding the start of each instruction, but it is still a contiguous, linear stream of bytes from memory. Keeping the processor pipeline...

The simple programmer's model of a processor executing machine language instructions is a loop of the following steps, with each step finished before moving on to the next step:

1. Fetch instruction
2. Decode instruction and fetch register operands
3. Execute arithmetic computation
4. Possible memory access (read or write)
5. Write back results to register

As mentioned in the introduction blog article, even if the processor can get each step down to a single cycle, that would be 2.5ns (5*0.5ns) for a 2GHz (2x10^9...
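The 2.5ns figure above can be reproduced with a small Python sketch. It assumes, as the teaser does, that each of the five steps takes exactly one clock cycle and that no steps overlap; the helper names are hypothetical and exist only for this illustration.

```python
# The five steps of the simple, non-overlapping execution model.
STEPS = [
    "fetch instruction",
    "decode instruction and fetch register operands",
    "execute arithmetic computation",
    "possible memory access (read or write)",
    "write back results to register",
]


def cycle_time_ns(clock_hz: float) -> float:
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def sequential_instruction_time_ns(clock_hz: float, cycles_per_step: int = 1) -> float:
    """Time for one instruction when every step finishes before the next begins."""
    return len(STEPS) * cycles_per_step * cycle_time_ns(clock_hz)


CLOCK_HZ = 2e9  # 2GHz, as in the text

per_instruction_ns = sequential_instruction_time_ns(CLOCK_HZ)
sequential_rate = 1e9 / per_instruction_ns  # instructions per second without overlap

print(f"cycle time:           {cycle_time_ns(CLOCK_HZ)} ns")
print(f"time per instruction: {per_instruction_ns} ns")
print(f"sequential rate:      {sequential_rate:.2e} instructions/s")
```

Under these assumptions the processor completes one instruction every five cycles, which is the gap that overlapping the steps in a pipeline is meant to close.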


The simple programmer's model of a processor executing machine language instructions is a loop of the following steps, with each step finished before moving on to the next step:

1. Fetch instruction
2. Decode instruction and fetch register operands
3. Execute arithmetic computation
4. Possible memory access (read or write)
5. Write back results to register

At a minimum it takes one processor clock cycle to do each step. However, for steps 1 and 4, accessing main memory may take much longer than one cycle. Modern processors typically...



Find what capabilities an application requires to successful run in a container

How to avoid wasting megabytes of memory a few bytes at a time

Instruction-level Multithreading to improve processor utilization

"Don't cross the streams": Thread safety and memory accesses at the speed of light

Superscalar Execution

Quickly determine which instruction comes next with Branch Prediction

Assembly Line for Computations

Reducing Memory Access Times with Caches
