Mitsunori Ogihara


    This paper proposes a notion of time complexity in splicing systems. The time complexity of a splicing system at length n is defined to be the smallest integer t such that all the words of the system having length n are produced within t rounds. For a function t from the set of natural numbers to itself, the class of languages with splicing system time complexity t(n) is denoted by SPLTIME[t(n)]. This paper presents fundamental properties of SPLTIME and explores its relation to classes based on standard computational models, both in terms of upper bounds and in terms of lower bounds. As to upper bounds, it is shown that for any function t(n), SPLTIME[t(n)] is included in 1-NSPACE[t(n)]; i.e., the class of languages accepted by a t(n)-space-bounded non-deterministic Turing machine with one-way input head. Expanding on this result, it is shown that 1-NSPACE[t(n)] is characterized in terms of splicing systems: it is the class of languages accepted by a t(n)-space uniform family of extend...
    We show the following results regarding complete sets. NP-complete sets and PSPACE-complete sets are many-one autoreducible. Complete sets of any level of PH, MODPH, or the Boolean hierarchy over NP are many-one autoreducible. EXP-complete sets are many-one mitotic. NEXP-complete sets are weakly many-one mitotic. PSPACE-complete sets are weakly Turing-mitotic. If one-way permutations and quick pseudo-random generators exist, then NP-complete languages are m-mitotic. If there is a tally language in NP ∩ coNP - P, then, for every ε > 0, NP-complete sets are not 2^(n(1+ε))-immune. These results solve several of the open questions raised by Buhrman and Torenvliet in their 1994 survey paper on the structure of complete sets.
    Valiant (SIAM J. Comput. 8 (1979) 410-421) showed that the problem of computing the number of simple s-t paths in graphs is #P-complete both in the case of directed graphs and in the case of undirected graphs. Welsh (Complexity: Knots, Colourings and Counting, Cambridge University Press, Cambridge, 1993, p. 17) asked whether the problem of computing the number of self-avoiding walks of a given length in the complete two-dimensional grid is complete for #P1, the tally-version of #P. This paper offers a partial answer to the question of Welsh: it is #P-complete to compute the number of self-avoiding walks of a given length in a subgraph of a two-dimensional grid. Several variations of the problem are also studied and shown to be #P-complete. This paper also studies the problem of computing the number of self-avoiding walks in a subgraph of a hypercube. Similar completeness results are shown for the problem. By scaling the computation time to exponential, it is shown that computing the...
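    To make the counting problem in the abstract above concrete, here is a minimal brute-force sketch. It is not the paper's construction; it only illustrates what is being counted. It assumes the subgraph is vertex-induced (given as a set of grid cells) and that walks start at a fixed vertex; the function name `count_self_avoiding_walks` and its parameters are hypothetical. The running time is exponential in the walk length, which is consistent with the hardness result but proves nothing about it.

```python
def count_self_avoiding_walks(cells, start, length):
    """Count walks of exactly `length` steps from `start` that never
    revisit a vertex, in the subgraph of the 2D grid induced by `cells`.

    `cells` is an iterable of (x, y) integer pairs (an assumption of this
    sketch: the subgraph is described by its vertex set only).
    """
    cells = set(cells)
    if start not in cells:
        return 0

    def extend(pos, visited, remaining):
        # A walk with no steps left counts as one completed walk.
        if remaining == 0:
            return 1
        x, y = pos
        total = 0
        # Try the four grid neighbours of the current position.
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt in cells and nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)  # backtrack
        return total

    return extend(start, {start}, length)


if __name__ == "__main__":
    # A 2x2 block of cells: from a corner there are exactly two
    # self-avoiding walks of length 3 (clockwise and counter-clockwise).
    block = {(0, 0), (1, 0), (0, 1), (1, 1)}
    print(count_self_avoiding_walks(block, (0, 0), 3))  # prints 2
```

    On a grid large enough that the boundary is never reached, the same function reproduces the familiar counts 4, 12, 36, 100 for lengths 1 through 4.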
    Convolutional neural networks (CNNs) have been successfully applied to both discriminative and generative modeling for music-related tasks. For a particular task, the trained CNN contains information representing the decision making or the abstracting process. One can hope to manipulate existing music based on this 'informed' network and create music with new features corresponding to the knowledge obtained by the network. In this paper, we propose a method to utilize the stored information from a CNN trained on a musical genre classification task. The network was composed of three convolutional layers, and was trained to classify five-second song clips into five different genres. After training, randomly selected clips were modified by maximizing the sum of outputs from the network layers. In addition to the potential of such CNNs to produce interesting audio transformations, more information about the network and the original music could be obtained from the analysis of the g...
    Music is not only for entertainment and for pleasure, but has been used for a wide range of purposes due to its social and physiological effects. Traditionally, musical information has been retrieved and/or classified based on standard reference information, such as the name of the composer and the title of the work. These basic pieces of information will remain essential, but information retrieval based on them is far from satisfactory. Huron points out that since the preeminent functions of music are social and psychological, the most useful characterization would be based on four types of information: the style, emotion, genre, and similarity [Huron, 2000].
    Approximate Nearest Neighbor Search (ANNS) is a fundamental algorithmic problem, with numerous applications in many areas of computer science. Locality-sensitive hashing (LSH) is one of the most popular solution approaches for ANNS. A common shortcoming of many LSH schemes is that since they probe only a single bucket in a hash table, they need to use a large number of hash tables to achieve a high query accuracy. For ANNS-L2, a multi-probe scheme was proposed to overcome this drawback by strategically probing multiple buckets in a hash table. In this work, we propose MP-RW-LSH, the first and so far only multi-probe LSH solution to ANNS in L1 distance. Another contribution of this work is to explain why a state-of-the-art ANNS-L1 solution called Cauchy projection LSH (CP-LSH) is fundamentally not suitable for multi-probe extension. We show that MP-RW-LSH uses 15 to 53 times fewer hash tables than CP-LSH for achieving similar query accuracies.
    Deficits in motor movement in children with autism spectrum disorder (ASD) have typically been characterized qualitatively by human observers. Although clinicians have noted the importance of atypical head positioning (e.g. social peering and repetitive head banging) when diagnosing children with ASD, a quantitative understanding of head movement in ASD is lacking. Here, we conduct a quantitative comparison of head movement dynamics in children with and without ASD using automated, person-independent computer-vision-based head tracking (Zface). Because children with ASD often exhibit preferential attention to nonsocial versus social stimuli, we investigated whether children with and without ASD differed in their head movement dynamics depending on stimulus sociality. The current study examined differences in head movement dynamics in children with (n = 21) and without ASD (n = 21). Children were video-recorded while watching a 16-min video of social and nonsocial stimuli. Three dimens...
    Making better use of the cache is important for modern computer programs and systems. One key step is understanding data locality. In this article, we investigate one effective and important model of data locality: reference affinity. This model is superior to previous ones because it can express whole-scale locality in a more accurate and flexible way. Traces collected from different applications are a rich source for the analysis of reference affinity. In this article, we extend strict reference affinity to weak reference affinity ...
    This paper presents the PlanMine sequence mining algorithm to extract patterns of events that predict failures in databases of plan executions. New techniques were needed because previous data mining algorithms were overwhelmed by the staggering number of very frequent, but entirely unpredictive patterns that exist in the plan database. This paper combines several techniques for pruning out unpredictive and redundant patterns which reduce the size of the returned rule set by more than three orders of magnitude. PlanMine has also been fully integrated into two real-world planning systems. We experimentally evaluate the rules discovered by PlanMine, and show that they are extremely useful for understanding and improving plans, as well as for building monitors that raise alarms before failures happen. Keywords: Sequence Mining, Predicting Plan Failures, Plan Monitoring The University of Rochester Computer Science Department supported this work. Supported by NSF grants CCR-9705594, CC...
    Introduction: Fidelity monitoring and feedback are critical components for implementing effective preventive interventions. However, these are also resource intensive for the research team even in efficacy/effectiveness trials, and generally too expensive to maintain when the program is implemented fully within the host institution or community. A reasonable goal is to reduce the cost of obtaining reliable and valid fidelity ratings for behavioral intervention programs by an order of magnitude or more. We present a proof of concept of a computational method that measures fidelity in the Familias Unidas intervention. Familias Unidas is a family-focused prevention intervention targeting externalizing behaviors among Hispanic youth. It is delivered by school counselors in Spanish to parents and in the home in English and Spanish. Our work uses speech analysis, knowledge engineering, and computational linguistics to measure fidelity. Methods/Results: The existing method for rating fide...
    Classification algorithms are difficult to apply to sequential examples because there is a vast number of potentially useful features for describing each example. Past work on feature selection has focused on searching the space of all subsets of features, which is intractable for large feature sets. We adapt sequence mining techniques to act as a preprocessor to select features for standard classification algorithms such as Naive Bayes and Winnow. Our experiments on three different datasets show that the features produced ...
    A typical behavioral intervention requires recording the audio or video of the session led by an intervention agent (facilitator). These recordings offer valuable information for training, supervision, and analysis of intervention impact. Yet, they are costly and resource intensive. During efficacy and effectiveness trials, typically more than 90% of the session recordings are not coded for fidelity. During implementation of prevention programs, even fewer sessions are coded for fidelity, and often the required procedures for recording the session and assessing fidelity become unsustainable due to cost and time. We propose a way to use computational methods to analyze fidelity automatically. These methods show promise for enabling local organizations to better implement carefully tested evidence-based programs. Our method analyzes the communication between the facilitator and at least one individual who is the target of the intervention. We demonstrate how these methods can u...
    We proposed a method to classify songs in the Million Song Dataset according to song genre. Since songs have several data types, we trained sub-classifiers on different types of data. These sub-classifiers are combined using both classifier authority and classification confidence for a particular instance. In the experiments, the combined classifier surpasses all of these sub-classifiers and the SVM classifier using concatenated vectors from all data types. Finally, the genre labels for the Million Song Dataset are provided.
    Purpose – Library data are often hard to analyze because these data come from unconnected sources, and the data sets can be very large. Furthermore, the desire to protect user privacy has prevented the retention of data that could be used to correlate library data to non-library data. The research team used data mining to determine library use patterns and to determine whether library use correlated to students’ grade point average. Design/methodology/approach – A research team collected and analyzed data from the libraries, registrar and human resources. All data sets were uploaded into a single, secure data warehouse, allowing them to be analyzed and correlated. Findings – The analysis revealed patterns of library use by academic department, patterns of book use over 20 years and correlations between library use and grade point average. Research limitations/implications – Analysis of more narrowly defined user populations and collections will help develop targeted outreach efforts...
    Social tags have been acknowledged as a highly useful resource in retrieving music by moods or topics. However, since social tags are open for labeling, some social tags are inaccurate. In this paper, we present a new framework to identify accurate social tags of songs. In our framework, we first clean and filter music tags. Then we apply an improved hierarchical clustering algorithm to group the tags to build a tag category. Based on the category, we classify songs using lyrics. In order to extend the semantic ...