C++ data structures update#47
Conversation
…nto update_from_gudhi
DavidLapous
left a comment
There was a problem hiding this comment.
Thanks Hannah for this PR, I did a first quick pass over the code today.
| * @ingroup multi_filtration | ||
| * | ||
| * @brief Class encoding the different generators, i.e., apparition times, of a \f$ k \f$-critical | ||
| * \f$\mathbb R^2\f$-filtration value in a degree-Rips filtration. That is, a \f$ k \f$-critical filtration value is |
There was a problem hiding this comment.
I didn't find a common name yet. Maybe Degree_geometric_bifiltration or Multi_geometric_bifiltration or ...
| * \f$\mathbb R^2\f$-filtration value in a degree-Rips filtration. That is, a \f$ k \f$-critical filtration value is | |
| * \f$\mathbb R^2\f$-filtration value in a degree-Rips-like filtration. That is, a \f$ k \f$-critical filtration value is |
@odinhg Do you have an idea of a name that encompasses all such filtrations, i.e., multicover, degree-rips, core, etc that can be encoded this way ?
Also, as python can efficiently reparametrize filtrations, this can (without any change in the python interface), any bifiltration could be encoded this way when the cpp type parameter is an integer.
-- > We can also use a generic name like flat_multicritical_bifiltration
I'll update the doc when we settle on something
There was a problem hiding this comment.
I think flat_multicritical_bifiltration is fine and generic enough to capture potential new bifiltration on the same form. The ones mentioned are all parameter-free density-based bifiltrations, but naming it by these properties might end up being too specific, and it doesn't really convey anything about the data structure.
There was a problem hiding this comment.
Is it not a bit too general or does the "flat" convey enough the constrains on the first parameter values? It does not sound very convenient to represent any random bi-filtration such way, only those whose first parameter is somehow parametrizable such that they can be mapped to the sequence of natural numbers.
There was a problem hiding this comment.
I think the point is the following : for these kinds of
So, as we delegate the reparametrization to python, this structure can efficiently encode any dense (in terms of
So would something like flat_dense_multicritical_bifiltration be ok ? I'm not sure about the dense (nor the flat), as it's not very explicit
There was a problem hiding this comment.
Honestly, I think you can omit the multicritical in the name (Flat_dense_bifiltration? Or just Flat_bifiltration in the end? At least the "flat" makes it clear that is does not encode any bifiltration, even if the meaning is not very clear.
There was a problem hiding this comment.
well if you want to encode a 1-critical generator at [x,k], you'll need room for [x,1]=inf, [x,2]=inf, ... [x,k-1]=inf, before. So it's really not well suited for 1-critical filtrations.
There was a problem hiding this comment.
It is not well-suited indeed, but not wrong. I don't think we should charge the class name too much (as it is an extensive use class, its name can pop up a lot in a single code if no alias was used), if it is not necessary.
| Degree_rips_bifiltration(Underlying_container &&generators, [[maybe_unused]] int number_of_parameters) | ||
| : generators_(std::move(generators)) | ||
| { | ||
| if constexpr (Ensure1Criticality) { | ||
| if (generators_.size() > 1) throw std::logic_error("Multiparameter filtration value is not 1-critical."); | ||
| } | ||
| } |
There was a problem hiding this comment.
| Degree_rips_bifiltration(Underlying_container &&generators, [[maybe_unused]] int number_of_parameters) | |
| : generators_(std::move(generators)) | |
| { | |
| if constexpr (Ensure1Criticality) { | |
| if (generators_.size() > 1) throw std::logic_error("Multiparameter filtration value is not 1-critical."); | |
| } | |
| } | |
| Degree_rips_bifiltration(Underlying_container &&generators, [[maybe_unused]] int number_of_parameters = 2) | |
| : generators_(std::move(generators)) | |
| { | |
| if (number_of_parameters != 2){ | |
| throw std::invalid_argument("This filtration can only encode. Got "+ std::to_string(number_of_parameters) + "."); | |
| } | |
| if constexpr (Ensure1Criticality) { | |
| if (generators_.size() > 1) throw std::logic_error("Multiparameter filtration value is not 1-critical."); | |
| } | |
| } |
There was a problem hiding this comment.
So, you want to consider the use of any other parameter as wrong, not just ignoring it? Should we not this for all the constructors then?
| static Degree_rips_bifiltration inf(int number_of_parameters = 2) | ||
| { | ||
| if constexpr (Co) { | ||
| throw std::logic_error("No biggest value possible for Co-filtrations yet."); |
There was a problem hiding this comment.
I'd put -inf by default, unless this cases an issue
There was a problem hiding this comment.
inf(p) returns the biggest possible value for Co == false, but the smallest possible value for Co == true (behavior we can change if you want) and vice versa for minus_inf(p). And representing the smallest possible negative corner with the underlying format and finitely many generators is not possible...
There was a problem hiding this comment.
I'm not sure I got that. Morally, if you use co (which I think should be called op I don;t know where the co comes from, maybe from co-relation) you're reversing the poset, i.e.
So, apart from int filtration values which would be shifted by one, I dont understand where is the issue.
Anyway, we're not using this yet, so we can deal with that later
There was a problem hiding this comment.
Exactly because we reverse the poset, Co == true. You cannot achive that with
| auto d = std::distance( | ||
| values.begin(), | ||
| std::lower_bound( | ||
| values.begin(), values.end(), static_cast<typename OneDimArray::value_type>(generators_[g]))); |
There was a problem hiding this comment.
A later todo:
This projects it to the right (which is sound), but due to float imprecisions I think I now prefer to fix this, with something like
multipers/multipers/multiparameter_module_approximation/approximation.h
Lines 1481 to 1486 in ab68db5
maybe as a template ?
| /** | ||
| * @brief Builds a new complex by reordering the cells in the given complex with the given permutation map. | ||
| */ | ||
| friend Multi_parameter_filtered_complex build_permuted_complex(const Multi_parameter_filtered_complex& complex, |
There was a problem hiding this comment.
| friend Multi_parameter_filtered_complex build_permuted_complex(const Multi_parameter_filtered_complex& complex, | |
| friend Multi_parameter_filtered_complex build_permuted_filtered_complex(const Multi_parameter_filtered_complex& complex, |
?
There was a problem hiding this comment.
I found that the input/output type already explicit on this, so I don't feel the need to have a long method name. But as you prefer.
| // TODO: This doesn't allow for negative dimensions | ||
| // Hannah: not sure what this comment means ? |
There was a problem hiding this comment.
Technically, negative things could be useful, but not yet. So let's ignore this ftm
| if len(default_values) > num_parameters: | ||
| del default_values[num_parameters:] |
There was a problem hiding this comment.
This looks cursed haha. I'll see what I can do later for this, in particular, multipers.array_api should be able to take care of this.
There was a problem hiding this comment.
But it works :D
But, yes, don't hesitate to make a cleaner version of this. I was surprised how weirdly complicated a resize is in python.
There was a problem hiding this comment.
I guess your issue is that if you do a
default_values = default_values[num_parameters:]then default_values becomes a local variable and doesn't affect the original one.
There are two options :
- use the nonlocal keyword, i.e.,
nonlocal default_values
default_values=default_values[num_parameters:]- or cheat a little bit with something like:
default_values[:]=default_values[num_parameters:]In any case, I'll have to put that in the multipers.array_api interface to recover the autodiff in default_values if it exists, so I'll fix that later.
There was a problem hiding this comment.
But at that line I want to delete default_values[num_parameters:] (i.e., prune all values at indices higher than num_parameters in default_values). default_values=default_values[num_parameters:] is doing the opposite. Or did you meant default_values=default_values[:num_parameters-1]?
| # TODO: Is there no directer way to convert a T[:] into a vector[T]? | ||
| # A memcpy would be much quicker than a python/cython for loop... | ||
| # With a continuous memory use, we could also pass the pointers as iterators if we have access to it? |
There was a problem hiding this comment.
I think the compiler is smart enough. As you have the nogil flag, this converts to pure cpp code. IIRC, as f is a vector, f[i] converts to *(f.data + i) and the same for filtration[i] with cython (with the same pointer type). I guess the compiler can then guess the memcpy, non ?
There was a problem hiding this comment.
No, in the produced cpp file, there is also a for loop. I was more thinking of a bit by bit copy. At least, the loop is a C++ one, not a python one and the vector size is allocated from the beginning. So, it should not make much of a difference. It is just that we are calling this method a lot.
There was a problem hiding this comment.
Yes there will be a loop in the cpp file, but this is an "easy optimization" for the compiler, so I hope that with high prob the generated assembly will have a memcpy.
aida stuff needs to be linked
…nto update_from_gudhi
…into update_from_gudhi
…nto update_from_gudhi
DavidLapous
left a comment
There was a problem hiding this comment.
Not totally finished (in particular degree-rips and stuff) but I don't want to manage multiple branches or PRs over this PR so LGTM.
In this PR
trueto ensure that the value remains 1-critical (to optimize some loops and throw if the user tries to add a generator which breaks this property).Dynamic_multi_parameter_filtration(corresponding to the two old ones merged),Multi_parameter_filtration(same than the other, but the generators are aligned in a single vector instead of a vector of generators) andDegree_rips_filtration(specialized class for Rips bi-filtration of the formDynamic_multi_parameter_filtrationis used for now, as the goal of the PR was to integrate the new interface before anything elsemultipers/gudhifolder:gudhi/Simplex_tree_multi.his nowgudhi/Multi_filtration/multi_filtration_utils.h, except for the methodlinear_projectionas I was told to not integrate it into Gudhi yet. So it was moved temporarily intoSimplex_tree_multi_interface.h,cubical_to_boundary.his nowbuild_[complex/slicer]_from_bitmapingudhi/slicer_helpers.h,mma_interface_coh.his nowgudhi/Multi_persistence/Persistence_interface_cohomology.handgudhi/Multi_persistence/Multi_parameter_filtered_complex_pcoh_interface.h,mma_interface_h0.handnaive_merge_tree.hare now in the subfoldertmp_h0_persas they are useless in the current state,mma_interface_matrix.his nowgudhi/Multi_persistence/Persistence_interface_matrix.h,scc_io.his transferred intogudhi/slicer_helpers.h,truc.his now decomposed intogudhi/Slicer.h,gudhi/Thread_safe_slicer.h,gudhi/Multi_parameter_filtered_complex.handgudhi/Projective_cover_kernel.h(and the Truc constructor from a simplex tree is now externalize ingudhi/slicer_helpers.h).BoxandLineuse a properPointclass now instead of filtration values, which is much cleaner. The new point class has the same properties as a vector, but has additionally all arithmetic properties of the filtration valuesis_multi_parameterparameter in its options anymore. We agreed with Vincent and Marc that it was fine to haveint number_of_parameters_as a permanent attribute (i.e. not optional) with default value at1number_of_parameters_attribute in this PR, as it is used to convertgudhi.SimplexTreetomultipers.SimplexTreeand therefore they methods have to agree. But once the version withnumber_of_parameters_is officially released in Gudhi, the attribute will be properly taken into accountTo do before merging (for @DavidLapous)
TODO:s in the doc. There are very few of them. There are also the introduction pages for Gudhi (so not existing in this repo, but here)To do for the next PR(s)
Multi_parameter_filtered_complexclass needs some changes. For now, it is very basic, as I wanted just to separate it from the Slicer class, but I would like to experiment with it in the future. Having for example an alternative where all containers are continuous vectors (and not vectors of vectors more or less), which would not be usable with its current interface. But as the class is internal, any changes to it will not have any impact on the interface of Slicer or anything in Python. I also do not think it is priority, the point above is more important.