ROOT rants: histogram hierarchy and a little PyROOT
I was doing some Google searching a couple of days ago, looking for answers to a ROOT question (I have a new one every day!), and I stumbled on a very nice rant about ROOT. It mentions the ugly default plotting style that was the focus of my last post, and it hits on many points I would have made myself.
Sorry, if you don’t have any programming experience this post might be too technical. If so, may I instead offer you a large man on a small vehicle?
One thing I really liked is the comment about the crazy inheritance: how a 2D histogram (TH2) inherits from a 1D histogram (TH1). The ROOT authors' reasoning seems to be that a 2D histogram can be thought of as a long 1D histogram “rasterized” onto a 2D field. In this (convoluted) way, the 2D histogram is a specific type of 1D histogram. But, as the author of that University of Minnesota page notes, there is no reason not to think of the relation going the opposite way: a 1D histogram is just a 2D histogram with only one bin in the second axis. And this second relation seems a whole lot more obvious and fundamental. The structure the ROOT authors use doesn’t hurt them too badly because in reality most of the operations on histograms happen bin by bin. The implementation of an n-dimensional array may very well boil down to a 1D array; but are we really doing the user any favors here?
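To make the “rasterized” picture concrete, here is a minimal pure-Python sketch (no ROOT needed) of flattening a 2D bin address into a single array index. The stride of `nbins_x + 2` assumes each axis carries one underflow and one overflow bin, which is my understanding of ROOT's convention; the function names `global_bin` and `unflatten` are made up for illustration and are not ROOT API.

```python
def global_bin(binx, biny, nbins_x):
    """Flatten a 2D bin address into a 1D index, row by row.

    Assumption: each axis has two extra bins (0 = underflow,
    nbins + 1 = overflow), so one row is nbins_x + 2 cells wide.
    """
    return binx + (nbins_x + 2) * biny


def unflatten(index, nbins_x):
    """Recover (binx, biny) from the flat index."""
    stride = nbins_x + 2
    return index % stride, index // stride


# A 4 x 3 histogram stored as one flat list of cells.
nbins_x, nbins_y = 4, 3
cells = [0.0] * ((nbins_x + 2) * (nbins_y + 2))

# "Filling" bin (2, 1) is just an increment in the flat array.
flat = global_bin(2, 1, nbins_x)
cells[flat] += 1.0
```

Seen this way, the storage really is a long 1D array, but nothing about that forces the class hierarchy to mirror it.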
With a TH2 inheriting from a TH1 certainly you would assume there is one advantage: the TH1 class won’t be cluttered with nonsensical methods like GetYaxis(). Of course you would be wrong, just check it out. In fact, given the structure they’ve ended up with, I’m having a hard time coming up with a reason why they even need separate classes for 1D and 2D histograms.
And while I’m still speaking of histograms, the profusion of varieties must be noted: TH1C, TH1S, TH1I, TH1F, TH1D. When I first started using ROOT and hadn’t yet carefully read the documentation, I assumed a TH1I would be used to histogram an integer-valued parameter; in other words, the x-axis would take integer values. Though most parameters we work with have continuous values, counts (usually called “multiplicities”) such as “how many electrons with energy greater than 10 GeV were in the event?” are also very important. Thus histograms over integer values would really fill a need. Of course, this is not what ROOT provides: all a TH1I does is promise to store each bin value (the number that defines the y-value on the graph) as an integer. This doesn’t change the fact that a TH1I can only be Fill()ed using a floating point weight. The Fill() method, and nearly every other method on this class, is defined generically using doubles in the TH1 parent class. Strangely, the functions for requesting a bin value are implemented in the integer-specific TH1I class, so you might think that at least they would do the sensible thing and deal only in integers. Instead, the integer stored internally is cast to a double before being returned. As far as I can tell, there is no way to get the unadulterated integers, which you know must be in there, out of this class. One might wonder if the different histogram varieties just offer different storage sizes, but then sizeof(Int_t) == sizeof(Float_t), so it’s unlikely. Maybe there is a slight speed advantage when incrementing (though it is hard to imagine this is an issue with any processor having a dedicated floating point unit, taking us back at least 20 years)? I don’t know, I give up.
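For what it's worth, the size claim and the cast-to-double behavior can be poked at without ROOT. The sketch below is plain Python and assumes Int_t is a 32-bit integer and Float_t a 32-bit IEEE float; the helper `through_float32` is hypothetical. It doesn't settle why both varieties exist, but it does show one measurable difference between the two storage types: a 32-bit float stops counting exactly long before a 32-bit integer does.

```python
import ctypes
import struct


def through_float32(x):
    """Round-trip a value through 32-bit float storage."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Same storage size (assuming Int_t ~ int32 and Float_t ~ float32).
same_size = ctypes.sizeof(ctypes.c_int32) == ctypes.sizeof(ctypes.c_float)

FLOAT32_EXACT_LIMIT = 2 ** 24   # 16777216
INT32_MAX = 2 ** 31 - 1         # 2147483647

# Adding one more count past 2**24 is lost in 32-bit float storage.
stalled = through_float32(float(FLOAT32_EXACT_LIMIT + 1))

# A double holds every 32-bit integer exactly, so casting an
# integer bin content to double on the way out loses no information.
lossless = int(float(INT32_MAX)) == INT32_MAX
```

So the integer is recoverable from the returned double with a plain `int(...)`, even if the interface doesn't hand it over directly.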
I think this is enough ROOT ranting for now. Possible topics for a further post:
- How histograms and TTrees are owned by the directory they are created in (whereas similar objects like graphs are not). When you are new to ROOT this is guaranteed to lead to mysterious segmentation faults.
- Code that runs without errors or warnings in both compiled and interpreted modes, but produces different results.
- The vector classes: why does the 2D vector have to use Mag() while the 3D vector uses
- Painful limitations to using STL classes like std::vector<> in interpreted mode. (The reason I gave up on doing any substantial work using interpreted ROOT code.)
- Horrible crashing that refuses to let you quit even with
Oh, and regarding my recent issue that led me to Google for answers: I’ve been using PyROOT a lot lately, but the underlying C++ bites you in the ass now and then. One issue I ran into is functions that modify values passed by reference. Python was designed to avoid this sort of thing, at least for the fundamental types, and so you have to do some annoying array('i', ) machinations just to pass a reference to an integer into a function. Thankfully this doesn’t happen often, and it turns out the ROOT manual does explain the work-around well enough if you look in the PyROOT section [PDF] on TTrees. (Instead of a TTree, I was actually trying to use TColor::HLS2RGB(), mostly foolishly, I must say.) On the whole, though, I would highly recommend PyROOT if you are doing anything high-level like making plots and you have to use ROOT.
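The pass-by-reference problem can be illustrated in plain Python, without ROOT installed. In the sketch below, `hls_to_rgb_by_ref` is a hypothetical stand-in for a C++ function with reference output parameters. It is not TColor::HLS2RGB() and makes no claim about that function's exact signature or hue units; the actual color math is borrowed from Python's standard `colorsys` module. The point is only that a one-element `array` gives the callee a buffer it can write into, whereas a bare number cannot be changed from inside a function.

```python
from array import array
import colorsys


def try_to_set(value):
    # Rebinding the local name does nothing to the caller's variable.
    value = 42


def hls_to_rgb_by_ref(h, l, s, r, g, b):
    """Hypothetical stand-in for a function with reference outputs.

    h, l, s are fractions in [0, 1]; r, g, b are one-element
    buffers that receive the result.
    """
    r[0], g[0], b[0] = colorsys.hls_to_rgb(h, l, s)


n = 0
try_to_set(n)  # n is unchanged afterwards

# One-element float buffers play the role of "references".
r, g, b = array("f", [0.0]), array("f", [0.0]), array("f", [0.0])
hls_to_rgb_by_ref(0.0, 0.5, 1.0, r, g, b)  # hue 0, half lightness, full saturation
rgb = (r[0], g[0], b[0])
```

After the call the results are read back out with `r[0]`, `g[0]`, `b[0]`, which is exactly the kind of ceremony that feels out of place in Python.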