Deliberate Thinking: joy

29 May 2019

Tuning G1 GC for Cassandra

Tuning G1 GC for Cassandra is too complicated, but it can make a big difference in cluster health.

Symptoms:

High p99 read/write latencies (because of long GC pauses)
High CPU causing lower read throughput (because of low GC throughput)
Dropped mutations (because of full GC collections on write-heavy clusters)

Here are some options that made a difference for me:

JVM: options for getting GC details out for inspection
-XX:+PrintGCDetails

-XX:+PrintGCDateStamps

-Xloggc:/var/log/cassandra/gc.log
JVM: options for having enough buffer for collections
# Pre-allocate full heap
-Xms24G

-Xmx24G

# Pre-size new size for high-throughput young collections
-Xmn8G
JVM: options for avoiding longer pauses (do reference scanning concurrently with app)
# Have the JVM do less remembered set work during STW, instead

# preferring concurrent GC.

-XX:G1RSetUpdatingPauseTimePercent=5

# Process Reference objects (soft/weak/phantom) in parallel to avoid long reference-processing times

-XX:+ParallelRefProcEnabled
JVM: options for better young collection throughput (avoid copying short-lived objects)
# Save CPU time by avoiding copying objects repeatedly

-XX:MaxTenuringThreshold=1

# Improve collection throughput by making heap regions larger

-XX:G1HeapRegionSize=32m
JVM: option cocktail to reduce risk of long mixed collections
# Avoid to-space exhaustion by starting sooner, capping new size, and being more aggressive during mixed collections

-XX:InitiatingHeapOccupancyPercent=40

-XX:+UnlockExperimentalVMOptions

-XX:G1MaxNewSizePercent=50

-XX:G1MixedGCLiveThresholdPercent=50

-XX:G1MixedGCCountTarget=32

-XX:G1OldCSetRegionThresholdPercent=5

# Reduce pause time target to make mixed collections shorter

-XX:MaxGCPauseMillis=300
JVM: option to get extra buffer for use in allocation emergency
# Reserve extra heap space to reduce risk of to-space overflows

-XX:G1ReservePercent=20
JVM: options for top collection throughput during pauses
# Max out the parallel effort during pause
# Set to number of cores

-XX:ParallelGCThreads=16

-XX:ConcGCThreads=16
Cassandra: option to avoid excess spikes of garbage from compaction
# Reduce load of garbage generation & CPU used for compaction
compaction_throughput_mb_per_sec: 2
Cassandra: option to aggressively flush to disk on write-heavy clusters
# Reduce amount of memtable heap load to reduce object copying
memtable_heap_space_in_mb: 1024 # instead of default 1/4 heap

The net effect of the above combined settings is as follows:

for a read-heavy cluster on i3.4xlarge:

young collection p90 pause times around 50ms
mixed collection p90 pause times around 90ms
no Full GCs, no dropped mutations

for write-heavy clusters on r5.2xlarge:

young collection p90 pause times around 175ms
mixed collection p90 pause times around 175ms
no Full GCs, no dropped mutations

Tuning process:

1) Turn on GC logging
2) Gather pause times for young collections, mixed collections, and any full collections
   - get logs for at least 2-3 cycles of young => mixed/full transitions
3) Decide which of the above you want to optimize for, pick a single set of settings
   - Apply the settings to one node on one rack
   - Decide whether it had the desired effect
   - Tweak and repeat on single node until you get to a stable point
4) Apply settings to all nodes on one rack
   - Wait for a peak traffic period or apply stress
   - Compare results from non-tuned racks with the tuned rack
   - Tweak and repeat on single rack until settings are rock solid
5) Apply settings to full cluster
   - Wait for a peak traffic period or apply stress
   - Make sure settings are rock solid for full cluster
6) Start again on step 2 until you have nothing left to tune
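
For the "gather pause times" step, here is a minimal Python sketch of how the p90 numbers could be pulled out of gc.log. It is not the script I used, just an illustration. It assumes the Java 8 style log lines that -XX:+PrintGCDetails produces for G1 (the exact shapes are listed in the comments), and the names percentile and summarize are made up for this example.

```python
import math
import re
from collections import defaultdict

# ASSUMPTION: Java 8 style G1 log lines, for example:
#   ...: [GC pause (G1 Evacuation Pause) (young), 0.0512345 secs]
#   ...: [GC pause (G1 Evacuation Pause) (mixed), 0.0912345 secs]
#   ...: [Full GC (Allocation Failure)  20G->9G(24G), 4.1234567 secs]
# Other JVM versions (unified logging in Java 9+) use a different format.
PAUSE_RE = re.compile(r"\[GC pause .*?\((young|mixed)\).*?, (\d+\.\d+) secs\]")
FULL_RE = re.compile(r"\[Full GC .*?, (\d+\.\d+) secs\]")


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(lines):
    """Group pause times (in ms) by collection type: count, p90 and max."""
    pauses_ms = defaultdict(list)
    for line in lines:
        match = PAUSE_RE.search(line)
        if match:
            # group(1) is "young" or "mixed", group(2) is the pause in seconds
            pauses_ms[match.group(1)].append(round(float(match.group(2)) * 1000, 3))
            continue
        match = FULL_RE.search(line)
        if match:
            pauses_ms["full"].append(round(float(match.group(1)) * 1000, 3))
    return {
        kind: {"count": len(ms), "p90_ms": percentile(ms, 90), "max_ms": max(ms)}
        for kind, ms in pauses_ms.items()
    }
```

To try it, open the log file configured with -Xloggc and pass the file object to summarize(). Any "full" entry in the result means a Full GC happened during the sampled window.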

20 May 2019

With You

Piece of pie
Clear blue sky
Mountain lakes
Leaves, and rakes

And when I am with you,
Two hearts filled with joy.
Our time passes softly,
Two hearts, one girl, one boy.

Teach me love
Teach me caring
I'll teach you happiness
I'll teach you daring

And when I am with you,
Two hearts filled with joy.
Our time passes softly,
Two hearts, one girl, one boy.

18 May 2019

Finally Stable Caps to Ctrl Mapping

Over the years, I have tried so many ways on Linux to map Caps Lock to Ctrl:

Xorg config (that didn't work on a tty console)
Inputrc (really weird rules that depended on initial state)
Gnome keyboard config

Finally there is a stable way to remap Caps Lock to Ctrl. It is relatively simple and it is at the lowest level in Linux.

Thanks to this page for all the details (https://wiki.archlinux.org/index.php/Map_scancodes_to_keycodes), though it took a lot of effort to parse out what actually needed to be done.

Here is a distilled version:

0) Make sure Caps Lock is OFF

1) Create the following file:

/etc/udev/hwdb.d/10-caps-modifier.hwdb

---

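evdev:atkbd:dmi:* # built-in keyboard: match all AT keyboards for now
 KEYBOARD_KEY_3a=leftctrl # bind capslock to leftctrl

Note: the KEYBOARD_KEY line must start with a space, and no blank line may sit between it and the match line above it.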

2) Run the following command:

$ systemd-hwdb update

3) Run the following command to make the remapping take immediate effect:

$ udevadm trigger

This was tested on Ubuntu 18.04.2 LTS running Xfce. The remapping works great in both tty consoles and the X input system.
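
The layout of the hwdb file is the part that is easy to get wrong, so here is a small Python sketch that renders a stanza in the expected shape: the match line at column 0 and each key mapping indented by one space. The function name render_hwdb is made up for this example, and it only builds the text. Writing it to /etc/udev/hwdb.d/ and running the two commands above is still up to you.

```python
def render_hwdb(match, key_map):
    """Return hwdb text for one record.

    match   -- the match line, e.g. "evdev:atkbd:dmi:*"
    key_map -- dict of scancode (hex string) to keycode name
    """
    lines = [match]  # match line starts at column 0
    for scancode, keycode in key_map.items():
        # property lines must be indented by at least one space
        lines.append(f" KEYBOARD_KEY_{scancode}={keycode}")
    # no blank line inside the record; a trailing newline ends the file
    return "\n".join(lines) + "\n"


# The Caps Lock to Ctrl mapping from this post
CAPS_TO_CTRL = render_hwdb("evdev:atkbd:dmi:*", {"3a": "leftctrl"})
```

Printing CAPS_TO_CTRL gives the two lines to put in the file, without the inline comments.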

02 June 2017

Print selection only in Chrome

Maybe you already know about "Print selection only" in Chrome. But it changed my life today.

I wanted to print only a part of a web page. Usually, I tweak the pages but then it spans pages and it's confusing to get just the pages I want. Or if I got desperate, I would copy/paste into a text editor and print that instead (after reformatting all the copy/paste noise away). Instead of all that nonsense, I found a better way.

Here's how to do it:

Select the text you want to print (in Chrome)
Click Print (or press Ctrl-P or Cmd-P on Mac)
Click "More Settings" in the Chrome print dialog
Select the "Selection only" box
Adjust "Scale" to get it on the right number of pages (1 page usually)

Then print and you can move on with your life. I love how simple it is. Hopefully you benefit from this.

03 May 2017

HELP: Happy, Evolving, Learning, Productive - Journaling template

Sometimes I go to write in my journal after lots has happened, and I find myself at a loss. Too much has happened, I don't know what's important to record, it's all a jumble. So I end up staring out the window and attempting a half-baked job of mentally processing things.

Here is a set of headings in an attempt to provide a skeleton for my journaling:

H: Happenings
E: Evolutions / Progress
L: Learnings
P: Plans

Or described more fully:

H: just neutral reporting on what random stuff has been going on
E: progress reports on things that have moved forward in some significant way in the last while
L: what you've recently learned that sticks out to you
P: what's coming up that has your attention

In an attempt at finding a mnemonic for HELP, I came up with the following:

H: Happy
E: Evolving
L: Learning
P: Productive

And here are some more detailed descriptions of the emotional states in the mnemonic:

H: just starting to write can keep you happier than not, see this post
E: highlighting progress, however small, turns into a sense of gratitude for me
L: seeing that I'm still learning is encouraging to me
P: putting rough plans on paper has a reassuring effect, and helps me move forward

Those adjectives have enough affinity to the headings that it might help me to remember them, and enough positive energy to help them to stick in my mind.

29 January 2017

Worry is a Signal, Not an Activity

"You look down today, what's going on?" my wife says when I get home from work. I answer, "I don't know, I was just worrying about this project at work." What's wrong with this picture?

The mental error is that I was treating worry like an activity instead of treating it like a signal. It's all about the self talk. When there is some outstanding issue that needs attention, it is easy to jump straight to thinking about the issue, even though you can't really do anything about it at the moment. The reality is that if I'm going to resolve the outstanding issue, I need my computer open, I need to talk with a team member to figure things out. I need to write some code or run a query to see where things stand.

But if I attempt to sort things out mentally while I don't have everything I need to make progress on an issue, it's easy to spin my wheels and fall helplessly into a non-productive mental loop.

On the other hand, if the "outstanding issue" thought comes into my mind and I call it worry (which it is), and if, instead of holding onto that thought, I treat it like a signal, like an alarm bell, like a red light, then that frees me up to act on the signal. Instead of treating the "outstanding issue" thought as an activity waiting to be engaged in, if I treat it as a self-alert, then I can move to deal with it at a later, more appropriate time.

The question becomes: "What action can I take right now to make sure that the outstanding issue is dealt with at the appropriate time and place?" Maybe make a reminder on an index card and put it in my work pants pocket. Maybe make a reminder on my phone. Maybe send myself a memory jogger in email. Maybe write a card and stick it in the Trello / Getting Things Done inbox. It just needs to be something that I am confident will get my attention and lead me down the right mental path at the time when I know I will have resources to deal with the issue.

Any time spent on worry as an activity (beyond dealing with the reminder for the future time/place) I now believe to be worse than a total waste. Not just worthless time spent, but also a drag on the rest of my life. Any unnecessary, anxiety-provoking activity drags me down, makes me less capable of living my life in a worthwhile, enjoyable way.

Why did I not see this earlier in my life? What could I have done to have learned this earlier in my adolescent / adult experience?

16 May 2013

Teaching Software

Today, I read "The Science in Computer Science" by Peter J. Denning in CACM (May 2013). Very interesting article.

The most interesting part to me was the references to how CS ideas are beginning to be taught in high school and elementary school. Here are some references that I found after looking into this:

http://csunplugged.org/open-source-edition-ms-word (fun, offline approach to teaching basics, CC licensed)
http://www.code.org/teach (great intro videos to kick off the idea, 1m, 5m, and 9m versions)
http://www.code.org/learn (links to sites to learn coding)
http://www.csprinciples.org/home/about-the-project (pilot project to reinvigorate AP CS course)

Here is one of those videos linked to above:

I need to volunteer to teach at my kids' school, and in a community education setting.

17 April 2013

Google Fiber in Provo, UT

This makes me want to move to Provo:

https://fiber.google.com/cities/provo/

That's something my wife thought she'd never hear! :)

04 October 2012

Rewriting Git History for Fun

For starters, if you're looking for a good JDBC connection pool, look no farther than c3p0 [github, doc].

A few years back, commons-dbcp wasn't very stable under load, so I went looking for an alternative and landed on c3p0. Ever since then, Steve Waldman has been making it better and better. Except for a 2-year hiatus when he wasn't working on it, he's been very responsive and willing to accept feedback and make improvements. I'm seriously impressed by the project.

Anyway, since I've been a long-time user and fan, when I saw Steve put his software on GitHub, I went for a look.

Here is what I saw:

But what about all those prior releases in source snapshot form, that I was used to seeing from SourceForge?

I realized that perhaps I could contribute a little to the project, so I went and grabbed all the source release zips from SourceForge, created a local git repo, created a commit for each release, tagged it, and spliced Steve's recent work on the top.

Then I submitted a GitHub issue to ask Steve what he thought -- wasn't really a pull request because it was a completely disconnected history.

Here is the top of the new history:

And here is the earliest part of that history:

The tools used were:

- git config author.name - to give credit where credit was due

- git config author.email

- curl - to download all the releases

- unzip | sed - to figure out what the commit date should be

- git commit --date "[release date]" - to create the commits

- vi .git/info/grafts - to temporarily splice the new history on the old

- git filter-branch --tag-name-filter - to rewrite the new history permanently on top

- git tag -f - to replace the existing release tags to point to the new rewritten history

- git push --tags - to push it up to github
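
The per-release part of that list (one commit per source snapshot, dated to the release, then tagged) can be sketched in Python roughly as follows. This is an illustration and not the commands I actually ran: the function names, the commit message, and the choice to also set GIT_COMMITTER_DATE are assumptions for this example. It passes --author on the command line instead of using git config, and the grafts / filter-branch splicing step is not covered.

```python
import os
import shutil
import subprocess


def release_commit_commands(tag, release_date, author):
    """Build the git commands that record one release snapshot.

    author is in "Name <email>" form; release_date is an ISO 8601 string.
    """
    return [
        ["git", "add", "--all"],
        ["git", "commit", "--author", author, "--date", release_date,
         "-m", f"Import release {tag}"],  # commit message is hypothetical
        ["git", "tag", tag],
    ]


def import_release(repo_dir, snapshot_dir, tag, release_date, author):
    """Replace the work tree with an unzipped snapshot, then commit and tag it."""
    for name in os.listdir(repo_dir):
        if name == ".git":
            continue  # keep the repository metadata
        path = os.path.join(repo_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
    # dirs_exist_ok needs Python 3.8+
    shutil.copytree(snapshot_dir, repo_dir, dirs_exist_ok=True)
    # --date only sets the author date; this also backdates the committer date
    env = dict(os.environ, GIT_COMMITTER_DATE=release_date)
    for cmd in release_commit_commands(tag, release_date, author):
        subprocess.run(cmd, cwd=repo_dir, env=env, check=True)
```

Calling import_release once per unzipped release, oldest first, in a repo created with git init builds up the linear tagged history. A snapshot identical to the previous one would make git commit fail, so check=True stops the run there.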

If I could consult for projects / companies to do this kind of VCS conversion work and actually get paid for it -- wouldn't that be awesome!

10 January 2012

Awesome CLI gem: Commandable

Ever wanted to drive your ruby code with a CLI? Well, the commandable gem makes it easy.

At first, I struggled with OptionParser in the stdlib, but I had to duplicate my option handling across CLI code and method invocation. Then OptionParser didn't handle required parameters at all, requiring me to do required parameters in a non-DRY way. Also, OptionParser doesn't do sub-commands, git-style, which I needed for the new git plugin I'm working on.

When I saw the pickled_optparse gem (from Mike Bethany), it appeared to solve the missing 'required parameter' feature. And the subcommand gem seemed to allow for subcommands with their own options, git-style. However, the duplication across option definition and method invocation was still present. And even subcommand required you to hook up the command to the class/method that needed to be called.

Finally I found that the author of pickled_optparse had totally rethought CLI in ruby with his commandable gem. And I really like what he did.

Now there is no real additional logic AT ALL in order to hook your ruby class up to the command line. Now my option handling code will be in the place it belongs, in the code that is doing the command that I invoked from the command line.

The main awesomeness here is that commandable pushes the CLI code into the appropriate place in your ruby class -- it makes it hard to scatter CLI code all over the place.

My only beefs with the gem are the following points:

it comes with colors enabled by default (unusual for a CLI)
it clears the screen by default when displaying help (when colors are enabled, very unusual and annoying for a CLI)
it appears to be hard to do global options like: "cmd --global subcommand [args]"
there is no support for parsing command CLI-style options into ruby hashes, but this is a minor nit, because that's what OptionParser is for, and it's easy to use inside the method

I fixed a problem where the screen was cleared even if you disabled colors, and I've pushed it up to my fork.

Here is my hello world that shows commandable in action:

And here's the output for a few representative cases:

31 December 2011

Useful Git Tips

Every one of these tips was useful to me:

http://mislav.uniqpath.com/2010/07/git-tips/

It just amazed me that these were written over a year ago, and I've been trying to learn as much about git as I could, but didn't even stumble across these in manpages.

I found this by looking for an option that would let "git remote update" only fetch a subset of the remotes I have attached.

It is rewarding to feel like I'm working with a rich toolset.

Thanks to @mislav for posting this.

BTW: Poking around in his tweet stream also yielded this gem. I've always wondered if there was a lightweight XPath for JSON, and there it is. Some background is in order, as much of the "XPath for Javascript" mentality was based on early jQuery thoughts.

21 December 2011

Intrusion Detection through Stackable Filesystems

I've always wondered what exploit might be running on my system, and never had any time to devise/install a detection system that would have the right balance of useful detection (maximize) and performance impact (minimize).

When I stumbled upon unionfs a couple weeks ago, I thought that was an interesting idea from a change-logging perspective. It's sometimes useful to be able to keep a filesystem-based diff of what a certain operation does to a system, and then bake it onto the system when I know it did what I wanted it to. The takeaway for me was that unionfs's performance overhead had the opportunity to be so low because it was so thin and so baked into the kernel.

This compounded with my recent discovery and fiddling with fusefs (in user-land) and I wondered about what kind of useful logic I could put underneath a filesystem. The GlusterFS feature set, the recent LessFS GC stuff, the bup-fuse stuff, and the S3FS stuff are all just *really* cool. I ended up gazing longingly at the big backlog of fuse filesystem suggestions on their wiki, and wandered back into the unionfs space.

So, when I saw the I3FS paper (linked from here) about the modular application of this technique to intrusion detection, this really triggered the Useful *AND* Performant neurons and I got really excited.

Unfortunately, on first glance, the stackable filesystems stuff seemed pretty cryptic to set up in a lightweight, just-works kind of way (think custom mount command-lines complete with arcane stacking options).

It would be soooo awesome to have an easy-to-compose ruby DSL for doing some kind of rack-like filesystem mashup with a kernel-level unionfs layer underneath a user-land fusefs layer, but all expressed in the same DSL.

This would be an awesome tool to put on top of the Arch-derived clone I want to put up for people at work. There are folks who care about living more on the edge of linux stuff, but who don't care to install from scratch, and also might not care that much about not having a full Gnome stack if things just work. And if I could give them the same tuned IDS-on-the-desktop solution (or upgrade their developer stack by letting them pull a filesystem delta over), that would be really cool.

The observation [PDF, linked from] that modularity makes development cheap describes one of the attributes valued highly by the Ruby community as well. It is one of the key things that make Ruby as a community awesome.

These kinds of ideas really matter, and making them so cheap and stable that you don't have to think about them really matters even more.

NOTE: The I3FS paper is really pretty old (2004), and the whole unionfs stack is older than that. The fuse stack came into the kernel a long time ago too (2005). So while this is new to me, it's been around for quite a long time. I'm playing catch-up.

04 September 2009

An entire sprint without an SVN branch

A typical development flow has been:

request an SVN task branch (wait for CM to help)
do stuff (1-3 weeks)
run all the regression tests (not on the latest stable)
hope that nobody did anything that conflicted with us
merge to the pending release (and pray we resolve any conflicts correctly)

Now we do the following:

set up a team git repo (able to do this without CM's help)
clone off a repo for each team member
set up a daily cron job to get the latest stable from the pending release
set up a daily hudson build to merge that in and push to team if it passes unit tests
pull from team as we go (fast!)
run all the regression tests (on the latest stable)
use git to merge on top of the pending release and commit back to SVN (with minimal conflict windows)

Well, we finally got through a whole sprint without an SVN task branch. Maybe that seems like a small victory, but it saved us a bunch of late-breaking integration risk.

14 July 2009

Refreshing Take on OS Upgrade

I'm typically a laggard when it comes to upgrading my OS. I like for things to be stable -- I don't like to waste time fiddling with things unrelated to what I've got to get done for whatever project I'm working on.

So now in July, I'm upgrading to Ubuntu 9.04 (released in April). To state how much of a laggard I usually am, I think this is the most up-to-date I've ever been on an OS upgrade.

I decided to use the standard "upgrade to 9.04" feature exposed in "Update Manager". So when I saw a screen like this:

then I felt happy. I felt like I knew what was going to happen, and that I was in control of the process to some degree. This freed my mind up and let me think about what I could do to minimize any upgrade risk.

In the past, because of the helpless feeling of being out of control of what's going to happen and what I can expect, I've avoided upgrading until the pain became great enough to justify a full backup/reorg of all my files that I for sure wanted to keep.

20 March 2009

MountainWest RubyConf 2009

MountainWest RubyConf 2009 really helped me reorient in a very good way.

The emergent theme seemed to be "modularity".

Six main talks that were really enlightening:

Jon Crosby: Rack
Jeremy Evans: Sequel
Yehuda Katz: Rails Rewrite
Adam Blum: Rhodes
Jim Weirich: Unified Theory of Software Development
Alan Whitaker: The Sweet Ruby Life

Rack totally changed how I thought about building a webapp. Decomposing things into pieces that have the minimum to do with each other is possible in a really clean way with rack. And treating each of those pieces as a separate app, even, with its own release schedule, etc. -- that was a very refreshing view.

When I heard Jon say that the Perl script he had up on the screen was a monolith -- he was absolutely right. And I thought that the system *I* was working on was a monolith.

I was going to give a lightning talk about fshelper.org's link feature, but I wanted to rewrite it in broken-up rack pieces. I tried my best to do that between talks, but I didn't quite get the app to function as middleware, so I just gave up and listened.

09 March 2009

Test code structure

There are a lot of ways to build an unmaintainable test suite. Jay addresses this topic straight on. The most important idea I got out of it is this: "If It Hurts, You're Doing It Wrong."

Now how to get from painful to joyful ... that is the question. Probably by just applying common sense and proper code structure to tests, not just production code.

UPDATE

I've done my share of painful, stupid things:

the monolithic build system that had super-ant-tasks with laser vision
the event subsystem that was really just JMS
the custom deploy system that really should have been one of rsync or rpm
the object persistence layer that was supposed to be super-generic, but was really tied super-close to the domain objects
... I'm sure I could go on

The main thing I've learned is to work with the door open. And stay wide open to how to do things better and to always strive to see the things I'm missing.