A popular solution for scientific computing with Python is numpy (its predecessors were Numeric and numarray).
rpy2 has features that ease integration with numpy-based code in both directions: from rpy2 to numpy, and from numpy to rpy2.
Vectors can be converted to numpy arrays using array() or asarray():
import numpy
import rpy2.robjects as robjects

# R's built-in vector of lowercase letters
ltr = robjects.r.letters
ltr_np = numpy.array(ltr)
This behavior is inherited from the low-level interface: the objects present an interface that numpy recognizes, and numpy uses that interface to determine the structure of the object.
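To illustrate what "an interface recognized by numpy" can look like, here is a minimal sketch built on numpy's array interface protocol. The class IntVector is hypothetical and is not part of rpy2; it only stands in for an object whose storage lives outside numpy, and the mechanism rpy2 actually uses may differ.

```python
import array
import numpy


class IntVector:
    """Hypothetical stand-in for a vector whose storage lives outside numpy."""

    def __init__(self, values):
        # array.array("i") stores C ints in one contiguous buffer.
        self._buf = array.array("i", values)

    def __getitem__(self, index):
        return self._buf[index]

    @property
    def __array_interface__(self):
        # Describe the buffer so that numpy can wrap it without copying.
        address, length = self._buf.buffer_info()
        return {
            "version": 3,
            "shape": (length,),
            # Native-endian signed integer of the buffer's item size.
            "typestr": numpy.dtype("i%d" % self._buf.itemsize).str,
            "data": (address, False),  # False: the memory is writable
        }


vec = IntVector([1, 2, 3, 4])

copied = numpy.array(vec)    # array() copies the data
shared = numpy.asarray(vec)  # asarray() wraps the same memory

copied[0] = 99               # does not affect vec
shared[2] = 42               # writes through to vec
print(vec[0], vec[2])        # 1 42
```

Because numpy reads the shape, item type, and memory address from the object itself, no rpy2-specific code is needed on the numpy side.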
The conversion of numpy objects to rpy2 objects can be activated by importing the module numpy2ri:
import rpy2.robjects.numpy2ri
That import alone is sufficient to switch on the automatic conversion of numpy objects into rpy2 objects.
Note
Why make this an optional import, when it could have been included in the function py2ri() (as done in the original patch submitted for that function)?
Although both are valid and reasonable options, the design decision was made to decouple rpy2 from numpy as much as possible, and not to assume that having numpy installed automatically means that a programmer wants to use it.
import rpy2.robjects as ro
import rpy2.rinterface as rinterface
import numpy

def numpy2ri(o):
    if isinstance(o, numpy.ndarray):
        if not o.dtype.isnative:
            raise ValueError("Cannot pass numpy arrays with non-native byte orders at the moment.")

        # The possible kind codes are listed at
        #   http://numpy.scipy.org/array_interface.shtml
        kinds = {
            # "t" -> not really supported by numpy
            "b": rinterface.LGLSXP,
            "i": rinterface.INTSXP,
            # "u" -> special-cased below
            "f": rinterface.REALSXP,
            "c": rinterface.CPLXSXP,
            # "O" -> special-cased below
            "S": rinterface.STRSXP,
            "U": rinterface.STRSXP,
            # "V" -> special-cased below
        }
        # Most types map onto R arrays:
        if o.dtype.kind in kinds:
            # "F" means "use column-major order"
            vec = rinterface.SexpVector(o.ravel("F"), kinds[o.dtype.kind])
            dim = rinterface.SexpVector(o.shape, rinterface.INTSXP)
            res = ro.r.array(vec, dim=dim)
        # R does not support unsigned types:
        elif o.dtype.kind == "u":
            raise ValueError("Cannot convert numpy array of unsigned values -- R does not have unsigned integers.")
        # Array-of-PyObject is treated like a Python list:
        elif o.dtype.kind == "O":
            res = ro.conversion.py2ri(list(o))
        # Record arrays map onto R data frames:
        elif o.dtype.kind == "V":
            if o.dtype.names is None:
                raise ValueError("Nothing can be done for this numpy array type %s at the moment." % (o.dtype,))
            df_args = []
            for field_name in o.dtype.names:
                df_args.append((field_name,
                                ro.conversion.py2ri(o[field_name])))
            res = ro.baseNameSpaceEnv["data.frame"].rcall(tuple(df_args))
        # It should be impossible to get here:
        else:
            raise ValueError("Unknown numpy array type.")
    else:
        # Anything that is not a numpy array goes through the default conversion
        res = ro.default_py2ri(o)
    return res

# Install the function as the robjects-level conversion hook
ro.conversion.py2ri = numpy2ri
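The conversion function above relies on two pieces of numpy behavior: the one-letter dtype.kind code that selects the R vector type, and ravel("F"), which flattens an array in column-major order. The following sketch uses numpy only (no rpy2 or R required) to show both; the sample arrays and variable names are illustrative assumptions.

```python
import numpy

# A 2 x 3 matrix, stored row by row (C order) by default.
m = numpy.array([[1, 2, 3],
                 [4, 5, 6]])

# ravel("F") walks down the columns first, which is the order R uses
# when it fills an array from a flat vector and a dim attribute.
flat = m.ravel("F").tolist()
print(flat)     # [1, 4, 2, 5, 3, 6]
print(m.shape)  # (2, 3)

# dtype.kind is the one-letter code used as the dictionary key.
samples = {
    "bool": numpy.array([True, False]),
    "int": numpy.array([1, 2]),
    "unsigned": numpy.array([1, 2], dtype="uint8"),
    "float": numpy.array([1.5, 2.5]),
    "complex": numpy.array([1j, 2j]),
    "bytes": numpy.array([b"a", b"b"]),
    "str": numpy.array(["a", "b"]),
    "object": numpy.array([None, None]),
    "record": numpy.zeros(2, dtype=[("x", "i4"), ("y", "f8")]),
}
kinds = {name: arr.dtype.kind for name, arr in samples.items()}
print(kinds)

# A record array exposes its field names, one per data frame column.
print(samples["record"].dtype.names)  # ('x', 'y')
```

The "u", "O", and "V" kinds are the ones that the function handles outside of its dictionary lookup.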
The rpy2.rinterface.SexpVector objects are made to behave like arrays, as defined in the Python package numpy.
The functions numpy.array() and numpy.asarray() can be used to construct numpy arrays:
>>> import numpy
>>> import rpy2.rinterface as rinterface
>>> rx = rinterface.SexpVector([1,2,3,4], rinterface.INTSXP)
>>> nx = numpy.array(rx)
>>> nx_nc = numpy.asarray(rx)
Note
When using asarray(), the data are not copied, so a change made through the numpy array is visible in the R vector:
>>> rx[2]
3
>>> nx_nc[2] = 42
>>> rx[2]
42
Note
The module numpy2ri is an example of how custom conversion to and from rpy2.robjects can be performed.