BeBOP Optimized Sparse Kernel Interface Library  1.0.1h
Todo List
Global BLAS_xAXPY (const oski_index_t *restrict len, const oski_value_t *restrict alpha, const oski_value_t *restrict x, const oski_index_t *restrict incx, oski_value_t *restrict y, const oski_index_t *restrict incy)
Correctly implement the negative stride case (not used in BeBOP).
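As a point of comparison for the negative stride case, the sketch below follows the reference BLAS convention, in which a negative increment means that logical element 0 is stored at the far end of the array. It is an illustration only: `sketch_axpy` is a hypothetical name, it takes its arguments by value rather than by pointer, and it uses `double` and `int` in place of `oski_value_t` and `oski_index_t`.

```c
#include <stddef.h>

/* Hypothetical stand-in for an AXPY kernel: y <- y + alpha*x.
   Assumes real double values; not the library's implementation. */
static void sketch_axpy(int len, double alpha,
                        const double *x, int incx,
                        double *y, int incy)
{
    if (len <= 0 || alpha == 0.0)
        return;

    /* With a negative stride, logical element 0 lives at offset
       (1 - len) * inc, which is non-negative, and the walk moves
       toward the start of the array. */
    ptrdiff_t ix = (incx < 0) ? (ptrdiff_t)(1 - len) * incx : 0;
    ptrdiff_t iy = (incy < 0) ? (ptrdiff_t)(1 - len) * incy : 0;

    for (int i = 0; i < len; i++) {
        y[iy] += alpha * x[ix];
        ix += incx;
        iy += incy;
    }
}
```

With `incx = -1` and `incy = 1`, the routine pairs `x[len-1]` with `y[0]`, `x[len-2]` with `y[1]`, and so on.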
Global BLAS_xSCAL (const oski_index_t *restrict len, oski_value_t *restrict alpha, oski_value_t *restrict x, const oski_index_t *restrict stride)
Correctly implement the negative stride case (not used in BeBOP).
Global ChooseFastest (oski_matrix_t A_tunable)
Implement this routine, given that we will need to allocate temporary vectors.
Global CountZeroRows (oski_index_t m, const oski_index_t *ptr)

This routine duplicates the functionality of oski_CountZeroRowsCSR(), and could be eliminated.
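For context on why the two routines overlap: in CSR, row i has no stored entries exactly when two consecutive row pointers are equal. The sketch below assumes that this is what both routines compute; `sketch_count_zero_rows` is a hypothetical name, and `int` stands in for `oski_index_t`.

```c
#include <stddef.h>

/* Count rows with no stored entries, given CSR row pointers ptr[0..m].
   Hypothetical helper for illustration only. */
static int sketch_count_zero_rows(int m, const int *ptr)
{
    int count = 0;

    if (ptr == NULL)
        return 0;

    for (int i = 0; i < m; i++) {
        if (ptr[i + 1] == ptr[i])   /* row i is empty */
            count++;
    }
    return count;
}
```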

File error.c
Wrap error handler get/set routines in mutexes.
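One possible shape for this, sketched with POSIX threads. Everything here is an assumption made for illustration: the handler type, the global variable, and the function names are hypothetical and do not match the declarations in error.c.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical handler type; the real handler signature differs. */
typedef void (*sketch_errhandler_t)(int code, const char *msg);

static sketch_errhandler_t g_sketch_handler = NULL;
static pthread_mutex_t g_sketch_lock = PTHREAD_MUTEX_INITIALIZER;

/* Example handler that ignores its arguments. */
static void sketch_quiet_handler(int code, const char *msg)
{
    (void)code;
    (void)msg;
}

/* Read the current handler while holding the lock. */
static sketch_errhandler_t sketch_get_handler(void)
{
    pthread_mutex_lock(&g_sketch_lock);
    sketch_errhandler_t h = g_sketch_handler;
    pthread_mutex_unlock(&g_sketch_lock);
    return h;
}

/* Install a new handler and return the previous one in a single
   critical section, so no concurrent update can slip in between. */
static sketch_errhandler_t sketch_set_handler(sketch_errhandler_t new_h)
{
    pthread_mutex_lock(&g_sketch_lock);
    sketch_errhandler_t old = g_sketch_handler;
    g_sketch_handler = new_h;
    pthread_mutex_unlock(&g_sketch_lock);
    return old;
}
```

Returning the old handler from the setter keeps the swap atomic; a separate get followed by a set would leave a window between the two calls.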
Group FAQ
Write a FAQ based on alpha-tester questions.
Global g_errlogfile
Make g_errlogfile a static global variable that the user can change to redirect errors elsewhere.
File matopts.h
Reimplement using Hoemmen's matrix generator.
Group MATTYPE_MBCSR
MBCSR currently depends too strongly on the BCSR data structure defined in include/oski/BCSR/format.h: MBCSR contains a pointer to a BCSR object and, moreover, initializes the fields of that object explicitly. We should weaken this dependence by implementing the submatrix instantiation functionality (see the defined but unused structure, oski_submat_t).
Group MIXSCALTYPES
Write a short explanation of how to recompile an application to use different scalar index and value types, and how to mix types.
Global OpenLua (void)
Create global OSKI-Lua matrix types for all registered types.
Global oski_CalcMatRepr1Norm (const void *mat, const oski_matcommon_t *props)
Fix the symmetric case; this is only an estimate.
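One way the symmetric case could be made exact, assuming half storage (only one triangle is stored): each stored off-diagonal entry a(i,j) also stands for a(j,i), so its magnitude contributes to both column j and column i. The sketch below is not the library's routine; `sketch_symm_1norm` is a hypothetical name, and it assumes 0-based indices, real `double` values, and no duplicate entries.

```c
#include <stdlib.h>

/* 1-norm (maximum absolute column sum) of an n x n symmetric matrix
   whose upper or lower triangle is stored in CSR.
   Returns -1.0 if the temporary array cannot be allocated. */
static double sketch_symm_1norm(int n, const int *ptr,
                                const int *ind, const double *val)
{
    double norm = 0.0;
    double *colsum;

    if (n <= 0)
        return 0.0;
    colsum = calloc((size_t)n, sizeof *colsum);
    if (colsum == NULL)
        return -1.0;

    for (int i = 0; i < n; i++) {
        for (int k = ptr[i]; k < ptr[i + 1]; k++) {
            int j = ind[k];
            double a = (val[k] < 0.0) ? -val[k] : val[k];
            colsum[j] += a;          /* stored entry a(i,j) */
            if (j != i)
                colsum[i] += a;      /* mirrored entry a(j,i) */
        }
    }
    for (int j = 0; j < n; j++) {
        if (colsum[j] > norm)
            norm = colsum[j];
    }
    free(colsum);
    return norm;
}
```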
Global oski_CheckArgsMatPowMult (const oski_matrix_t A_tunable, oski_matop_t opA, int power, oski_value_t alpha, const oski_vecview_t x_view, oski_value_t beta, oski_vecview_t y_view, oski_vecview_t T_view, const char *caller)
Need more verbose error messages in the event that the vectors have improper dimensions.
Global oski_CheckPattern (const oski_index_t *ptr, const oski_index_t *ind, oski_index_t m, oski_index_t n, oski_index_t b, oski_inmatprop_t pattern)

Check symmetric and Hermitian full storage cases.

Add a symmetric/Hermitian pattern check.

Global oski_CreateLuaMatReprGenericFromCSR (lua_State *L, const char *mattype)
Possible memory leak in the else branch.
Global oski_CreateMatReprFromCSR (const oski_matCSR_t *mat, const oski_matcommon_t *props,...)
This fill ratio is wrong for symmetric matrices.
Global oski_CreateSubmatReprFromCSR_funcpt (const oski_matCSR_t *mat, const oski_matcommon_t *props, const oski_submat_t *sub,...)
Future functionality: implement submatrix instantiation.
Global oski_DestroyBCSRFillProfile (oski_fillprofile_BCSR_t *fill)
Does not need to be type-dependent.
Global oski_FreeInputMatRepr (oski_matrix_t A)
Possible leak?
Global oski_GetMatClique (const oski_matrix_t A_tunable, const oski_index_t *rows, oski_index_t num_rows, const oski_index_t *cols, oski_index_t num_cols, oski_vecview_t vals)
Test thoroughly!
Global oski_GetMatDiagValues (const oski_matrix_t A_tunable, oski_index_t diag_num, oski_vecview_t vals)
Test thoroughly!
Global oski_HeurEstimateCost (const oski_matrix_t A)
This implementation currently assumes our pessimistic upper bound on the heuristic evaluation cost, 40 times the time to stream through the matrix, and should be changed to use a build-time benchmark.
Global oski_InitMatTypeManager (void)
Should call this routine during library initialization.
Global oski_IsMatPermuted (const oski_matrix_t A_tunable)
Update this routine when Ali Pinar's TSP reordering code is implemented.
Class oski_matCSR_t
Add a flag to indicate whether the matrix has a full (all non-zero) diagonal so that the triangular solve kernel does not have to check this condition explicitly.
Global oski_MatReprMult (const void *A, const oski_matcommon_t *props, oski_matop_t opA, oski_value_t alpha, const oski_vecview_t x_view, oski_value_t beta, oski_vecview_t y_view)
Delete the following assignment statements, made obsolete by above call to oski_TransposeProps().
Global oski_MatReprMultAndMatReprTransMult (const void *A, const oski_matcommon_t *props, oski_value_t alpha, const oski_vecview_t x_view, oski_value_t beta, oski_vecview_t y_view, oski_matop_t opA, oski_value_t omega, const oski_vecview_t w_view, oski_value_t zeta, oski_vecview_t z_view)
What to do here if either of these calls fails?
Global oski_MatReprTrisolve (const void *T, const oski_matcommon_t *props, oski_matop_t opT, oski_value_t alpha, oski_vecview_t x_view)
In theory, this routine could encounter OP_CONJ.
Global oski_matTRIPART_t
Fill in formal description of this format.
Global oski_MatTrisolve (const oski_matrix_t T_tunable, oski_matop_t opT, oski_value_t alpha, oski_vecview_t x_view)
For efficiency, this routine does not attempt to pre-scan the matrix data structure and ensure there are no zero diagonals. At least for CSR and CSC input matrices, we should add some kind of check somewhere (e.g., at matrix handle creation time). A similar to-do appears elsewhere in this source.
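A check of the kind described could look roughly like the sketch below, run once at handle creation time so that the triangular solve kernel itself stays scan-free. The name `sketch_has_full_diag` is hypothetical, and the code assumes a square CSR matrix with 0-based indices, real `double` values, no implicit unit diagonal, and that duplicate entries, if any, are summed.

```c
/* Return 1 if every row i of an n x n CSR matrix stores a nonzero
   (i,i) entry, and 0 otherwise. Hypothetical helper. */
static int sketch_has_full_diag(int n, const int *ptr,
                                const int *ind, const double *val)
{
    for (int i = 0; i < n; i++) {
        double d = 0.0;

        for (int k = ptr[i]; k < ptr[i + 1]; k++) {
            if (ind[k] == i)
                d += val[k];   /* assumption: duplicates are summed */
        }
        if (d == 0.0)
            return 0;          /* missing or zero diagonal in row i */
    }
    return 1;
}
```

The result could be cached in a flag on the matrix structure, which is also what the oski_matCSR_t entry on this page asks for.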
Global oski_ReallocInternal (void **p_ptr, const char *elem_type, size_t elem_size, size_t num_elems, const char *source_file, unsigned long line_number)
Implement oski_ReallocInternal().
Global oski_SetHint (oski_matrix_t A_tunable, oski_tunehint_t hint,...)
Restructure so that va_end() is always called.
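A common way to guarantee this is to route every path, including error paths, through a single exit point. The sketch below illustrates the pattern only; the hint constants, the output parameters, and `sketch_set_hint` itself are hypothetical and unrelated to the real oski_SetHint() argument list.

```c
#include <stdarg.h>

enum { SKETCH_HINT_NONE = 0, SKETCH_HINT_BLOCK = 1 };

/* Returns 0 on success and -1 on error. Every path leaves through
   the `done` label, so va_end() always pairs with va_start(). */
static int sketch_set_hint(int *out_r, int *out_c, int hint, ...)
{
    int err = 0;
    va_list ap;

    va_start(ap, hint);
    switch (hint) {
    case SKETCH_HINT_NONE:
        break;
    case SKETCH_HINT_BLOCK: {
        int r = va_arg(ap, int);
        int c = va_arg(ap, int);
        if (r <= 0 || c <= 0) {
            err = -1;          /* invalid block size: no early return */
            goto done;
        }
        *out_r = r;
        *out_c = c;
        break;
    }
    default:
        err = -1;              /* unknown hint */
        break;
    }
done:
    va_end(ap);
    return err;
}
```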
Global oski_SetMatClique (oski_matrix_t A_tunable, const oski_index_t *rows, oski_index_t num_rows, const oski_index_t *cols, oski_index_t num_cols, const oski_vecview_t vals)
Test thoroughly!
Global oski_SetMatDiagValues (oski_matrix_t A_tunable, oski_index_t diag_num, const oski_vecview_t vals)
Test thoroughly!
Global oski_TuneMat (oski_matrix_t A_tunable)

The current implementation does not try to re-tune if already tuned.

Check that the new data structure really is faster than the old.

Global oski_ViewPermutedMat (const oski_matrix_t A_tunable)
Update this routine when Ali Pinar's TSP reordering code is implemented.
Global oski_ViewPermutedMatColPerm (const oski_matrix_t A_tunable)
Update this routine when Ali Pinar's TSP reordering code is implemented.
Global oski_ViewPermutedMatRowPerm (const oski_matrix_t A_tunable)
Update this routine when Ali Pinar's TSP reordering code is implemented.
Global oski_WrapCSR (oski_matcommon_t *out_props, oski_index_t *Aptr, oski_index_t *Aind, oski_value_t *Aval, oski_index_t num_rows, oski_index_t num_cols, oski_inmatpropset_t *props, oski_copymode_t mode)

The output properties data structure actually defines a more general property about the diagonal, namely, that it is all ones. However, the available input matrix properties only allow the user to specify whether or not there is an implicit unit diagonal. Thus, it is possible that the user could create an input matrix with an explicit unit diagonal, but this condition is not checked when wrapping the data structure. It might be desirable to do this to make optimized triangular solve for the unit diagonal case more efficient.

Similarly, the oski_matCSR_t data structure has "is_upper" and "is_lower" flags, which could be set even if the user asserts that the matrix has a "general" pattern.
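Both conditions could be detected in one pass over the pattern at wrap time. The following is a sketch under stated assumptions, not the library's code: `sketch_shape_t` and `sketch_scan_csr` are hypothetical names, the matrix is square with 0-based indices and real `double` values, and each row stores at most one diagonal entry.

```c
/* Result of a single scan over a CSR matrix. */
typedef struct {
    int is_upper;       /* no stored entry below the diagonal */
    int is_lower;       /* no stored entry above the diagonal */
    int has_unit_diag;  /* every row stores an explicit 1 at (i,i) */
} sketch_shape_t;

static sketch_shape_t sketch_scan_csr(int n, const int *ptr,
                                      const int *ind, const double *val)
{
    sketch_shape_t s = {1, 1, 1};

    for (int i = 0; i < n; i++) {
        int diag_is_one = 0;

        for (int k = ptr[i]; k < ptr[i + 1]; k++) {
            if (ind[k] < i)
                s.is_upper = 0;
            else if (ind[k] > i)
                s.is_lower = 0;
            else
                diag_is_one = (val[k] == 1.0);
        }
        if (!diag_is_one)
            s.has_unit_diag = 0;
    }
    return s;
}
```

The scan costs one pass over the stored entries, which is small next to the cost of tuning.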

Global ScatterBlockRow (const oski_index_t *ptr, const oski_index_t *ind, const oski_value_t *val, oski_index_t n, oski_index_t indbase, oski_index_t i0, oski_index_t r, oski_index_t c, oski_index_t d0, oski_index_t *has_block_col, oski_index_t *inds, oski_value_t *blocks, oski_value_t *diag)
Update documentation for this routine.
File src/init.c
Make this module thread-safe!
Global SymmMatMult (const oski_matCSR_t *A, const oski_matcommon_t *props, oski_matop_t opA, oski_value_t alpha, const oski_vecview_t x_view, oski_vecview_t y_view)
Test the case when indices are unsorted.
Global testmat_GenDenseCSR (oski_index_t m, oski_index_t n, oski_index_t **p_ptr, oski_index_t **p_ind, oski_value_t **p_val)
What is the best way to initialize the dense matrix? For benchmarking, it seems sufficient to initialize all entries to 'tiny' values (here, $\frac{1}{\max(m, n)}$), which is faster than calling the random number generator.
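The initialization described above can be sketched as follows. This is an illustration rather than the routine's actual body: `sketch_gen_dense_csr` is a hypothetical name, `int` and `double` stand in for `oski_index_t` and `oski_value_t`, indices are 0-based, and the product m*n is assumed to fit in an `int`.

```c
#include <stdlib.h>

/* Build an m x n dense matrix in CSR form with every entry set to
   1/max(m, n). Returns 0 on success and -1 on failure. The caller
   frees *p_ptr, *p_ind, and *p_val. */
static int sketch_gen_dense_csr(int m, int n,
                                int **p_ptr, int **p_ind, double **p_val)
{
    if (m <= 0 || n <= 0)
        return -1;

    size_t nnz = (size_t)m * (size_t)n;
    int *ptr = malloc(((size_t)m + 1) * sizeof *ptr);
    int *ind = malloc(nnz * sizeof *ind);
    double *val = malloc(nnz * sizeof *val);

    if (ptr == NULL || ind == NULL || val == NULL) {
        free(ptr);
        free(ind);
        free(val);
        return -1;
    }

    double tiny = 1.0 / (double)(m > n ? m : n);

    for (int i = 0; i < m; i++) {
        ptr[i] = i * n;                    /* row i starts at i*n */
        for (int j = 0; j < n; j++) {
            ind[(size_t)i * n + j] = j;    /* every column is present */
            val[(size_t)i * n + j] = tiny;
        }
    }
    ptr[m] = m * n;

    *p_ptr = ptr;
    *p_ind = ind;
    *p_val = val;
    return 0;
}
```

A constant fill keeps setup time proportional to a single pass over the array, with no random number generator calls.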
File testvec.h
testmat_ChooseDim and ChooseDivisible do not actually need to be build-type dependent.
File upper_conj.c
The case of sorted indices assumes not only that the indices are sorted, but also that there is a unique diagonal element. Do we need to fix this? The general ordering case makes no such assumption.
File upper_normal.c
The case of sorted indices assumes not only that the indices are sorted, but also that there is a unique diagonal element. Do we need to fix this? The general ordering case makes no such assumption.