Units
sire has a full units system, meaning that every value can be
properly associated with its accompanying physical unit.
The easiest way to create a value with a unit is to use the
sire.u() function.
>>> import sire as sr
>>> print(sr.u("5 nanometers"))
50 Å
>>> print(sr.u("6.5 kcal mol-1"))
6.5 kcal mol-1
>>> print(sr.u("35°"))
35°
A robust grammar and parser converts most common string representations of units. You can use the long form of the unit
>>> print(sr.u("10 kilocalories per mole"))
10 kcal mol-1
or the short form
>>> print(sr.u("10 kcal.mol-1"))
10 kcal mol-1
It recognises all of the SI prefixes, both in long and short form
>>> print(sr.u("10 microseconds"))
1e+07 ps
>>> print(sr.u("10µs"))
1e+07 ps
and has a specially-built grammar to combine units together
>>> print(sr.u("300 nm ps-2"))
3000 Å ps-2
Note
Most functions that take a unit as an argument will automatically
convert strings to units using sire.u(). This means that you
often won’t need to call sire.u() explicitly.
Unit vectors
In addition to scalars, sire also supports unit vectors for
lengths, velocities and forces. You can construct these using the
sire.v() function. This accepts numbers, strings, lists,
dictionaries or anything that can be converted via sire.u().
>>> print(sr.v(1, 2, 3))
( 1 Å, 2 Å, 3 Å )
>>> print(sr.v("1 A", "2 A", "3 A"))
( 1 Å, 2 Å, 3 Å )
>>> print(sr.v(3, 4, 5, units="A"))
( 3 Å, 4 Å, 5 Å )
>>> print(sr.v(x="1 A ps-1", y="2 A ps-1", z="3 A ps-1"))
( 1 Å ps-1, 2 Å ps-1, 3 Å ps-1 )
>>> print(sr.v([3,4,5], units="A ps-1"))
( 3 Å ps-1, 4 Å ps-1, 5 Å ps-1 )
Converting to other units
By default, the values are printed in internal units (see later). You
can convert them to a value of any compatible units using the
to() function, e.g.
>>> print(sr.u("100m").to("kilometer"))
0.1
>>> print(sr.u("25 celsius").to("K"))
298.15
>>> print(sr.u("15 kcal.mol-1").to("J mol-1"))
62760.0
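The conversions above come down to scale-factor arithmetic between compatible units (plus an offset for temperature scales). The following is a minimal sketch in plain Python that reproduces the three results without using sire. The factor tables and the names convert and celsius_to_kelvin are hypothetical, hand-written for illustration, and are not how sire implements to().

```python
# Illustrative only: hand-written scale factors, not sire's implementation.
# Each table gives the size of a unit in a reference unit for its dimension.
LENGTH_IN_M = {"m": 1.0, "kilometer": 1000.0}
ENERGY_IN_J_PER_MOL = {"kcal.mol-1": 4184.0, "J mol-1": 1.0}


def convert(value, src, dst, table):
    """Convert value from unit src to unit dst via the shared reference unit."""
    return value * table[src] / table[dst]


def celsius_to_kelvin(value):
    """Temperature scales need an offset rather than a pure scale factor."""
    return value + 273.15


print(convert(100, "m", "kilometer", LENGTH_IN_M))                # 0.1
print(celsius_to_kelvin(25))                                      # 298.15
print(convert(15, "kcal.mol-1", "J mol-1", ENERGY_IN_J_PER_MOL))  # 62760.0
```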
Supported units
sire supports a large number of units. You can use both the short
and long name of the unit, and can use the singular or plural form for
the long name. For example:
>>> print(sr.u("10 kcal"))
10 kcal
>>> print(sr.u("10 kilocalorie"))
10 kcal
>>> print(sr.u("10 kilocalories"))
10 kcal
Note that sire uses A as the short form for angstrom. It is
not interpreted as ampere (you need to use the full name for ampere).
This is because sire is focussed on molecular simulation, where
A is commonly used to mean angstrom. You can also use Å.
Here is the full set of supported units.
Long units
calorie, joule, hartree, mole, dozen, radian, degree, angstrom, meter, bohr, inch, inches, foot, feet, yard, mile, second, minute, hour, day, week, fortnight, akma, dalton, gram, tonne, newton, ounce, pound, stone, hundredweight, pascal, bar, atm, atmosphere, psi, mmHg, kelvin, celsius, fahrenheit, amp, ampere, volt, farad, watt, electron, e_charge, mod_electron, faraday, coulomb, kcal_per_mol, kJ_per_mol
Short units
cal, J, Ha, mol, rad, °, Å, A, m, ", ', in, ft, mph, kph, s, g, N, Pa, K, °K, °C, °F, V, F, W, e, |e|, C
Note
Parsing is case-sensitive. So 1 joule would parse correctly, while
1 Joule would raise an error.
SI prefixes
All of the SI prefixes from quecto to quetta are supported, in both long and short forms.
>>> print(sr.u("5 picometers"))
0.05 Å
>>> print(sr.u("5 pm"))
0.05 Å
>>> print(sr.u("10 megajoules"))
2390.06 kcal
>>> print(sr.u("10 MJ"))
2390.06 kcal
This includes using u, µ or μ as the short version of micro.
>>> print(sr.u("5 μs"))
5e+06 ps
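A prefix is just a power-of-ten multiplier applied to the base unit before any conversion to the output unit. The following is a minimal sketch in plain Python, using a hypothetical hand-written subset of short prefixes (sire itself covers the full range from quecto to quetta). It treats u, the micro sign and the Greek letter mu as the same prefix, as described above.

```python
# Illustrative subset of short SI prefixes; not sire's own table.
SHORT_PREFIXES = {
    "p": 1e-12,
    "n": 1e-9,
    "u": 1e-6,       # ASCII stand-in for micro
    "\u00b5": 1e-6,  # micro sign
    "\u03bc": 1e-6,  # Greek small letter mu
    "m": 1e-3,
    "k": 1e3,
    "M": 1e6,
}
ANGSTROM_IN_M = 1e-10


def prefixed_metres_to_angstrom(value, prefix):
    """Scale value by the prefix, then express the length in angstrom."""
    return value * SHORT_PREFIXES[prefix] / ANGSTROM_IN_M


# 5 pm expressed in angstrom (approximately 0.05, matching the output above)
print(prefixed_metres_to_angstrom(5, "p"))
```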
Raising units to a power
You can raise a unit to a power using the following symbols.
- the number itself, e.g. m 3, m3, mol-1
- **, e.g. m**3, m ** 3, mol**-1
- ^, e.g. m^3, m ^ 3, mol^-1
Powers can be positive or negative, but must always be integers.
sire doesn’t support raising units to fractional powers.
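The three notations can be recognised with a single pattern. The following is a minimal sketch in plain Python and is not sire's grammar: the names POWER_PATTERN and split_power are hypothetical, and the pattern only handles a single unit name made of letters and underscores. It accepts integer powers only, so a fractional power is rejected.

```python
import re

# Hypothetical pattern: a unit name, optionally followed by an integer power
# written as "m3", "m 3", "m**3", "m ** 3", "m^3" or "m ^ 3".
POWER_PATTERN = re.compile(
    r"^([A-Za-z_]+)(?:\s*(?:\*\*|\^)?\s*([+-]?\d+))?$"
)


def split_power(text):
    """Return (unit_name, integer_power) for a single unit such as 'mol-1'."""
    match = POWER_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"cannot parse {text!r} as a unit with an integer power")
    name, power = match.groups()
    return name, int(power) if power is not None else 1


for example in ["m 3", "m3", "mol-1", "m**3", "m ** 3", "mol**-1", "m^3", "mol^-1"]:
    print(example, "->", split_power(example))
```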
Combining units
The following symbols can be used to multiply units together.
- a space, e.g. m s-1, kcal mol-1
- *, e.g. m*s-1, m * s-1, kcal*mol-1, kcal * mol-1
- ., e.g. m.s-1, kcal.mol-1
The following symbols can be used to divide units.
- /, e.g. m/s, m / s, kcal/mol, kcal / mol
- per, e.g. m per s, kcal per mol
Units are combined from right to left, meaning that kcal / mol / A**2
is evaluated as kcal / (mol / A**2).
You can use round brackets to control the order of evaluation, e.g.
(kcal / mol) / A**2 would give the molar energy per square angstrom.
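The difference between the two groupings is easiest to see by tracking the power of each unit. The following is a minimal sketch in plain Python, not sire's parser: units are represented as hypothetical dictionaries mapping a unit name to its integer power, and multiply and divide are illustrative helpers.

```python
def multiply(a, b):
    """Multiply two units by adding the powers of each named unit."""
    result = dict(a)
    for name, power in b.items():
        result[name] = result.get(name, 0) + power
    # Drop units whose powers cancel to zero.
    return {name: power for name, power in result.items() if power != 0}


def divide(a, b):
    """Divide a by b, i.e. multiply a by b with all powers negated."""
    return multiply(a, {name: -power for name, power in b.items()})


kcal, mol, A2 = {"kcal": 1}, {"mol": 1}, {"A": 2}

# kcal / mol / A**2, combined from right to left: kcal / (mol / A**2)
right_to_left = divide(kcal, divide(mol, A2))

# (kcal / mol) / A**2, with brackets forcing the intended grouping
bracketed = divide(divide(kcal, mol), A2)

print(right_to_left)  # {'kcal': 1, 'mol': -1, 'A': 2}
print(bracketed)      # {'kcal': 1, 'mol': -1, 'A': -2}
```

Only the bracketed form ends up with A raised to the power -2, which is the molar energy per square angstrom.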
Note
The per keyword can only be used to combine individual units, e.g.
kcal per mol, not kcal per (mol / A**2). Also note that
per is evaluated first, and only between the two units it
is placed between. So kcal per mol / A**2 will be evaluated
as (kcal per mol) / A**2.
Good rules of thumb are to use per when you want to create a derived
unit such as miles per hour or kcal per mol, and to use / only
with round brackets to make sure that you get the order of evaluation
that you intend. Alternatively, do not use division at all, but instead
raise units to negative powers, e.g. miles hour-1 or kcal.mol-1.
Changing default units
sire prints values out using default output units. You can change
these using the functions in sire.units, e.g.
sire.units.set_si_units() will change the output to SI units,
while sire.units.set_internal_units() will change the output to
internal (AKMA-style) units.
Changing the output units just changes how they are printed. It doesn’t
change their internal representation. For more info, see the section
below on Under the hood - GeneralUnit.
You can set your own default units by calling
sire.units.set_default_unit() or sire.units.set_default_units().
For example
>>> sr.units.set_default_unit("kJ mol-1")
will set the default molar energy unit to kJ mol-1, while
>>> sr.units.set_default_units(["m", "J", "kg", "C", "rad", "K", "mol"])
will set the default units for the seven physical dimensions to the passed values.
Note
The strings passed must be parseable into a unit. This is then set as the default for that unit, with the string used to represent that unit on output.
Note
Units that don’t have a default are constructed from the defaults of the seven physical dimensions.
Indeed, sire.units.set_si_units() is really just calling
sire.units.set_default_units() with a list of units that are
commonly set as part of the SI system (including derived units such
as J for energy, W for power, N for force, etc.)
You can clear the set of default units by calling
sire.units.clear_default_units().
Conversion from pint
The sire.u() function can auto-convert from other units systems.
For example, you can pass in units created via pint.
>>> import pint
>>> ureg = pint.UnitRegistry()
>>> distance = 24.0 * ureg.meter
>>> print(sr.u(distance))
2.4e+11 Å
>>> print(sr.u(distance).to(sr.u(ureg.centimeter)))
2400
Conversion from BioSimSpace
The sire.u() function can auto-convert from BioSimSpace too!
>>> import BioSimSpace as BSS
>>> import sire as sr
>>> distance = 3.5 * BSS.Units.Length.angstrom
>>> print(sr.u(distance))
3.5 Å
Conversion from other packages
Indeed, sire.u() can auto-convert from any units package
that can convert to a standard units string. By default, if
sire.u() does not recognise the type, then it converts
the unit to a string, and then tries to parse it using the
in-built grammar. This should work for most cases, especially
if the other package can print units in a standard, human-readable way.
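The string fallback described above can be pictured as "call str() on the object, then parse the text". The following is a minimal sketch in plain Python and is not sire's code: ForeignQuantity is a hypothetical stand-in for a quantity type from another package, and split_value_and_unit is an illustrative helper that only separates the leading number from the unit text (the unit text would then go to a real unit parser).

```python
import re


class ForeignQuantity:
    """Hypothetical stand-in for a quantity type from another units package."""

    def __init__(self, magnitude, unit):
        self.magnitude = magnitude
        self.unit = unit

    def __str__(self):
        # Prints in a human-readable "value unit" form.
        return f"{self.magnitude} {self.unit}"


def split_value_and_unit(obj):
    """Convert obj to text, then split a leading number from the unit text."""
    text = str(obj).strip()
    match = re.match(r"^([+-]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?)\s*(.*)$", text)
    if match is None:
        raise ValueError(f"no leading number found in {text!r}")
    return float(match.group(1)), match.group(2)


print(split_value_and_unit(ForeignQuantity(3.5, "angstrom")))  # (3.5, 'angstrom')
print(split_value_and_unit("24.0 meter"))                      # (24.0, 'meter')
```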
Under the hood - GeneralUnit
sire.u() works by parsing the string using a grammar that is built
on top of the sire.units.GeneralUnit class. This class holds
the unit as a combination of a value and the physical dimension of the unit.
For example, 5 m is 5 times a physical length (L). There
are seven physical dimensions:
- Mass (M)
- Length (L)
- Time (T)
- Charge (C)
- temperature (t)
- Quantity (Q)
- Angle (A)
Every physical unit is a combination of these. For example, kcal
is energy, which is M L2 T-2 (remember, E = mc2, i.e. a mass times
a velocity squared).
Similarly, kcal mol-1 is energy / Quantity, so M L2 T-2 Q-1.
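The bookkeeping of dimensions can be sketched as a set of seven integer powers that add on multiplication and subtract on division. The following plain-Python sketch is an illustration only and is not the GeneralUnit class: the Dimension dataclass and its field names are hypothetical, chosen to mirror the seven dimensions listed above. Building energy as mass times velocity squared gives the powers M 1, L 2, T -2.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class Dimension:
    """Integer powers of the seven physical dimensions (illustrative only)."""

    M: int = 0  # mass
    L: int = 0  # length
    T: int = 0  # time
    C: int = 0  # charge
    t: int = 0  # temperature
    Q: int = 0  # quantity
    A: int = 0  # angle

    def __mul__(self, other):
        # Multiplying units adds the powers of each dimension.
        return Dimension(*(a + b for a, b in zip(astuple(self), astuple(other))))

    def __truediv__(self, other):
        # Dividing units subtracts the powers of each dimension.
        return Dimension(*(a - b for a, b in zip(astuple(self), astuple(other))))

    def __pow__(self, n):
        # Raising to an integer power multiplies every power by n.
        return Dimension(*(a * n for a in astuple(self)))


MASS, LENGTH, TIME, QUANTITY = (
    Dimension(M=1), Dimension(L=1), Dimension(T=1), Dimension(Q=1)
)

velocity = LENGTH / TIME
energy = MASS * velocity ** 2      # E = m c^2 -> M L2 T-2
molar_energy = energy / QUANTITY   # e.g. kcal mol-1 -> M L2 T-2 Q-1

print(energy)
print(molar_energy)
```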
The power of each physical dimension of a unit can be queried via the
functions of GeneralUnit, e.g. MASS() returns the power of the M dimension.
Internally, each dimension has a base unit which is used for scaling all
values along that dimension. The base units represent 1.0 for that
dimension. sire has base units chosen that lead to the highest precision
and best performance for the dimensional scale on which it operates
(namely the atomic scale). It uses the AKMA
system, which is very common for molecular simulation codes.
- Mass: dalton (chosen so 1 g mol-1 equals 1.0)
- Length: angstrom
- Time: akma (chosen so that a time of 1.0 is compatible with the other units with no need for any scaling factors. It is approximately 0.0489 ps, so 1 ps is approximately 20.455 internal time units)
- Charge: absolute electron charge (chosen so a proton has charge 1.0 and an electron has charge -1.0)
- temperature: kelvin
- Quantity: 1
- Angle: radian
A value of 5 meters is thus stored internally as 5e10 Length,
while 100 ps is stored internally as 2045.48 Time. You can
get the internal value of any unit by calling the
value() function, e.g.
>>> print(sr.u("100 ps").value())
2045.4828280872953
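The internal value for time can be cross-checked by hand. The following plain-Python sketch assumes (this is an assumption, not something stated explicitly above) that the internal energy unit is kcal mol-1, so that the internal time unit is the one for which mass in g mol-1, length in angstrom and energy in kcal mol-1 need no scaling factor, i.e. time squared equals mass times length squared divided by energy.

```python
import math

# Assumed internal units (see the lead-in): g mol-1, angstrom, kcal mol-1.
GRAM_PER_MOL_IN_KG_PER_MOL = 1.0e-3
ANGSTROM_IN_M = 1.0e-10
KCAL_PER_MOL_IN_J_PER_MOL = 4184.0

# time**2 = mass * length**2 / energy, evaluated in SI units
akma_time_in_s = math.sqrt(
    GRAM_PER_MOL_IN_KG_PER_MOL * ANGSTROM_IN_M**2 / KCAL_PER_MOL_IN_J_PER_MOL
)
akma_time_in_ps = akma_time_in_s / 1.0e-12

# 100 ps expressed as a multiple of the internal time unit
internal_value_of_100_ps = 100.0 / akma_time_in_ps

print(akma_time_in_ps)           # approximately 0.0489
print(internal_value_of_100_ps)  # approximately 2045.48
```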
The choice of internal base units is almost invisible though, as
sire performs conversion from and to default output units whenever
a value is created or printed. The default output unit for time is
picoseconds, so 100 ps, when printed, will be converted from
2045.48 Time to 100 ps on output.
>>> print(sr.u("100 ps"))
100 ps
You can control the default output units using
the functions in sire.units. For example, calling
sire.units.set_si_units() will change the default output units
to SI values.
>>> print(sr.u("10 kJ mol-1"))
2.39006 kcal mol-1
>>> sr.units.set_si_units()
>>> print(sr.u("10 kJ mol-1"))
10 kJ mol-1
Note
Changing the output units does not change how the units are stored
in sire. It just changes the scaling factors used to convert
the units to/from input and output.
You can restore the default units using sire.units.set_internal_units().
>>> sr.units.set_internal_units()
>>> print(sr.u("10 kJ mol-1"))
2.39006 kcal mol-1
You can also set individual units, e.g.
sire.units.set_mass_unit(), sire.units.set_energy_unit() etc.
Under the hood - Python to C++
In the Python layer, sire stores the value in a
GeneralUnit object. This is a wrapper around the
C++ class of the same name. This C++ class is used as a temporary
intermediary to convert to templated PhysUnit<M,L,T,C,t,Q,A> objects.
These are template metaobjects, which store the physical dimension as
parameters held in the type of the C++ object (the M,L,T,C,t,Q,A
parameters to the template). The object itself is just a standard
double, which holds the magnitude for the unit. This means that
a vector of units is just a vector of doubles. All of the unit checking
and unit code is handled via template metafunctions which are evaluated
at compile time. This means that unit types do not take up any more
space or any more compute time than plain double precision numbers.
These templated PhysUnit<M,L,T,C,t,Q,A> types are automatically
created from the C++ GeneralUnit class on function calls, and
are automatically converted back to a C++ GeneralUnit class
if a unit is returned (with this being wrapped up and exposed via
the GeneralUnit Python wrapper).
In addition, the C++ Vector class, which represents a 3D point in space,
is automatically converted to hold Length types when it is queried
from the Python layer. Internally, it just holds three double precision
numbers. These are automatically converted to (or converted from) Length
types when queried from C++ or Python. This minimises memory usage
and maximises compute speed.