Ding, Weihao (2017-08). Experiment of Mapper Algorithm on High-Dimensional Data in Microseismic Monitoring. Master's Thesis.

abstract

  • The objective of this research is to use data-driven methods to analyze
    microseismic monitoring data, in particular topological data analysis (TDA), with
    limited reliance on physically based approaches. Python Mapper (PM) is the TDA tool
    used in this study. Microseismic data has many of the characteristics of big data. The
    stage-by-stage microseismic analysis suggested by previous studies also avoids the
    limitation of current software, which can process only slightly more than 10,000 data
    points. During this study, TDA packages continued to evolve to handle larger and more
    complex data, such as Betti Mapper by Spark.

    PM combines topology principles and machine learning methods
    into an integrated data analytics implementation. The high dimensionality of
    microseismic data practically limits what classical statistical analyses can achieve, so
    machine learning techniques such as dimensionality reduction are required for such
    datasets. PM stands out for its ability to retain the raw features of the data set when
    a machine learning algorithm is applied.

    The first portion of the study observes the relations among microseismic data
    points, both for the entire data set and stage by stage. Dividing the attributes into
    location and signal data reveals the relations within and between the two data types.

    The main finding from the location-data network is that high-density areas
    tend to be earlier events and could indicate where high pressure starts to build up, that
    is, the origins of the fracture networks. Origins that are initially far apart grow into
    each other, resulting in one network (most of the time) or more (rarely more than two).
    Fracture growth with complex directions of extension can be represented with a
    much simpler, single-directional network. Signal data reveals location-specific data
    quality trends. These trends are hardly visible when attributes are investigated in pairs
    but are obvious when all attributes are mapped together. Locational and geological
    characteristics may explain them, but further information is needed to confirm the
    observations. In fracture growth software, these trends would allow researchers to
    ignore the location of the wellbore and focus on the actual origins of the fracture
    network. An override that includes the discontinuity of the network and the confidence
    of the stimulated reservoir volume could be added manually to improve the accuracy of
    the fracture simulation.

    A sensitivity analysis of the PM parameters is carried out to test the robustness of
    the method, and the results are compared with clustering of the raw data to
    demonstrate the effectiveness and benefits of using TDA. TDA is a strong method for
    data preprocessing and analysis with virtually unlimited possibilities, but it should
    never be the end of a project. The results from PM could be used as input for many
    other studies.
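
The Mapper idea summarized in the abstract can be illustrated with a small sketch. This is not the Python Mapper API and not the thesis workflow; it is a pure-Python outline of the general Mapper construction (a filter value per point, an overlapping interval cover, clustering inside each interval, and edges between clusters that share points). The names `mapper` and `single_linkage`, the parameters `n_intervals`, `overlap`, and `eps`, and the toy coordinates are all assumptions made for illustration.

```python
import math
from itertools import combinations


def single_linkage(points, ids, eps):
    """Cluster the given point ids: points within eps of each other are linked."""
    parent = {i: i for i in ids}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in combinations(ids, 2):
        if math.dist(points[a], points[b]) <= eps:
            parent[find(a)] = find(b)

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), set()).add(i)
    return list(clusters.values())


def mapper(points, filter_values, n_intervals=4, overlap=0.25, eps=1.0):
    """Return (nodes, edges) of a simple one-dimensional Mapper graph.

    nodes: list of sets of point indices (one set per cluster).
    edges: set of (i, j) node-index pairs whose clusters share a point.
    """
    lo, hi = min(filter_values), max(filter_values)
    length = (hi - lo) / n_intervals
    pad = length * overlap  # each interval is widened by this amount on both sides
    nodes = []
    for k in range(n_intervals):
        a = lo + k * length - pad
        b = lo + (k + 1) * length + pad
        ids = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(single_linkage(points, ids, eps))
    edges = {(i, j) for i, j in combinations(range(len(nodes)), 2)
             if nodes[i] & nodes[j]}
    return nodes, edges


if __name__ == "__main__":
    # Toy 2-D coordinates, made up for illustration (not thesis data).
    events = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
    x_filter = [p[0] for p in events]  # filter: the first coordinate
    nodes, edges = mapper(events, x_filter, n_intervals=2, overlap=0.25, eps=1.0)
    print(nodes, edges)
```

Rerunning such a sketch with different values of `n_intervals`, `overlap`, and `eps` and comparing the resulting graphs is one simple way to picture what a parameter sensitivity analysis of a Mapper-style method involves.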

publication date

  • August 2017