Patents and Publications

Method and a system for minimizing roaming cost in a mobile communication network

US 8112063B2 · Issued Feb 1, 2012

The present invention deals with a method and system for routing a call in a mobile communication network. The method comprises receiving a message by a caller prevailing network corresponding to a caller from a callee home network corresponding to a callee, if the callee is roaming. The message is received in response to the call being initiated by the caller for the callee. The message can comprise redirection information corresponding to the roaming callee. The method further comprises solving a predefined criterion for routing the call based on the redirection information in the message and connecting the call based on the solution of the predefined criterion.

See patent

Dynamic mixed-initiative dialog generation in speech recognition

US 7941312 B2 · Issued May 10, 2011

Disclosed are a method (500), apparatus (100) and computer program product for generating a mixed-initiative dialog to obtain information for dialog slots. A composite grammar dependent upon a set of unfilled slots is constructed (501). A prompt, dependent upon the set of unfilled slots, is presented (309) to a user. An utterance is received (301) from the user in response to said prompt. Relevant information is determined based upon the further utterance. One or more said unfilled slots are filled (302) with said relevant information.
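
The loop described in this abstract (build a grammar from the unfilled slots, prompt, receive an utterance, fill what matches) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the slot names, the regular-expression "grammars" in SLOT_PATTERNS, and the helper names are all hypothetical, and text input stands in for speech recognition.

```python
import re

# Hypothetical per-slot patterns standing in for real speech grammars.
SLOT_PATTERNS = {
    "origin": r"\bfrom (\w+)",
    "destination": r"\bto (\w+)",
    "date": r"\bon (\w+)",
}


def unfilled_slots(slots):
    """Return the names of slots that still have no value."""
    return [name for name, value in slots.items() if value is None]


def composite_grammar(unfilled):
    """Build a grammar that covers only the slots still unfilled."""
    return {name: re.compile(SLOT_PATTERNS[name]) for name in unfilled}


def build_prompt(unfilled):
    """Build a prompt that depends on the set of unfilled slots."""
    return "Please tell me your " + ", ".join(unfilled) + "."


def fill_slots(slots, utterance):
    """Fill every unfilled slot for which the utterance carries information."""
    grammar = composite_grammar(unfilled_slots(slots))
    for name, pattern in grammar.items():
        match = pattern.search(utterance)
        if match:
            slots[name] = match.group(1)
    return slots


slots = {"origin": None, "destination": None, "date": None}
fill_slots(slots, "from london to pune")   # user volunteers two slots at once
print(build_prompt(unfilled_slots(slots)))  # only the date is still requested
```

Because the grammar and the prompt are rebuilt from whatever remains unfilled, the user may answer several questions in one utterance, which is the mixed-initiative behaviour the abstract refers to.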

See patent

Dialog call-flow optimization

US 7908143B2 · Issued Mar 15, 2011

The present invention is concerned with reorganizing dialog call-flow in the presence of resource constraints. A call-flow has a set of dialogs. The set of grammars in a given call-flow set of dialogs is determined. Each grammar has an associated resource requirement. The resource constraint of the device is also determined. The dialogs are reorganized subject to the device resource constraints not being exceeded by a resultant resource requirement of merged dialogs. The grammars can be split into atomic dialogs before the reorganization is performed. The reorganization includes merging at least two of the dialogs.

See patent

Mobile wireless device adaptation based on abstracted contextual situation of user using near-field communications and information collectors

A mobile wireless device, such as a mobile wireless phone, is adapted based on a user’s current abstracted contextual situation, where the context of a user is determined using devices enabled with near-field communication technology. Dynamic information of a user of a mobile device, such as the identity of his or her current environment, is determined using near-field communication, such as radio-frequency identification (RFID) tags. Static information regarding the user is also determined, where such static information can include the user’s preferences regarding how the mobile device should adapt to certain environments. An abstracted contextual situation of the user is synthesized based on this dynamic and static information. One or more adaptation directives for the mobile device of the user are determined based on the user’s abstracted contextual situation. The adaptation directives are implemented for the mobile device, without user interaction, or by instructing the user to appropriately configure the mobile device.

See patent

Method for fault handling in a co-operative workflow environment

Embodiments herein provide a fault-handling scheme based on forward recovery for cooperative workflow environments. The fault handling scheme relies on the correct placement of transaction scopes and their associated fault and compensation handlers for maintaining correct application semantics, a fault propagation scheme for forwarding faults to a workflow component that has the corresponding fault handler, and a distributed mechanism for collecting data of completed workflow components to facilitate recovering from faults. The fault handling scheme makes use of control flow messages to facilitate compensation of nested transaction scopes (residing in different components). The workflow components are also modified with additional code for aiding with fault propagation and fault recovery.

See patent

Method for organizing semi-structured data into taxonomy, based on tag-separated clustering

A method organizes semi-structured data into a taxonomy, based on Tag-Separated (TS) clustering. The method comprises retrieving documents including the semi-structured data. The semi-structured data comprises structured data including structured data fields and tags, and unstructured data. The method selects a structured attribute type including any of a categorical attribute, a numerical attribute, and a tag associated with annotated text, and an unstructured attribute type including a text attribute. The method clusters the semi-structured data from the retrieved documents into a plurality of clusters based on the selected structured attribute type and the selected unstructured attribute type. For a categorical attribute, each category corresponds to a single cluster. For a numerical attribute, a clustering algorithm clusters numerical data projected onto a range of the numerical attribute. For an annotated text attribute, a monothetic clustering algorithm clusters annotated text data according to tags associated with a vocabulary for the annotated text data.

See patent

System and Method for inferring invisible traffic

US 9646253 B2 · Filed May 9, 2017

This disclosure is directed to techniques for inferring traffic information or estimating total volume of traffic/data flowing through a target network/entity, wherein only a partial subset of inferred traffic information or volume of data is available to a predictor entity/network that infers such traffic information. In an embodiment, such partial subset of total traffic can either be made available to the entity/network for inferring and estimating total traffic or such partial data can actually flow through the entity/network.

See patent

METHOD for streaming SVD computation

The present disclosure is directed to techniques for efficient streaming SVD computation. In an embodiment, streaming SVD can be applied for streamed data and/or for streamed processing of data. In another embodiment, the streamed data can include time series data, data in motion, and data at rest, wherein the data at rest can include data from a database or a file and read in an ordered manner. More particularly, the disclosure is directed to an efficient and faster method of computation of streaming SVD for data sets such that errors, including reconstruction error and loss of orthogonality, are bounded. The method avoids SVD re-computation of already computed data sets and ensures updates to the SVD model by incorporating only the changes introduced by the new entrant data sets.
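
To make the idea of updating an SVD without recomputing it concrete, here is a minimal sketch of a standard column-append update in NumPy. It is not claimed to be the method of this disclosure; the function name append_columns and the test sizes are assumptions chosen for illustration. Only the new columns C and the existing factors are touched; the original data matrix is never revisited.

```python
import numpy as np


def append_columns(U, s, Vt, C):
    """Update a thin SVD A = U @ diag(s) @ Vt to the SVD of [A, C]."""
    k = s.size
    c = C.shape[1]
    L = U.T @ C                      # part of C inside the current column space
    H = C - U @ L                    # part of C orthogonal to it
    Q, R = np.linalg.qr(H)           # orthonormal basis for the new directions
    # Small core matrix whose SVD carries the whole update.
    K = np.block([[np.diag(s), L],
                  [np.zeros((R.shape[0], k)), R]])
    Uk, sk, Vkt = np.linalg.svd(K)
    U_new = np.hstack([U, Q]) @ Uk
    # Extend V with an identity block for the appended columns.
    V_ext = np.block([[Vt.T, np.zeros((Vt.shape[1], c))],
                      [np.zeros((c, k)), np.eye(c)]])
    Vt_new = (V_ext @ Vkt.T).T
    return U_new, sk, Vt_new


rng = np.random.default_rng(0)
A = rng.standard_normal((8, 4))      # data seen so far
C = rng.standard_normal((8, 2))      # newly arrived columns
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U2, s2, Vt2 = append_columns(U, s, Vt, C)

# The two error measures named in the abstract, computed for this toy case.
reconstruction_error = np.linalg.norm(U2 @ np.diag(s2) @ Vt2 - np.hstack([A, C]))
orthogonality_loss = np.linalg.norm(U2.T @ U2 - np.eye(U2.shape[1]))
```

In a real stream the factors would usually be truncated to a fixed rank after each update, which is where reconstruction error and gradual loss of orthogonality come from; this sketch keeps every component, so both quantities stay at floating-point noise.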

See patent

PARTIALLY DECENTRALIZED COMPOSITION OF WEB SERVICES

A partially decentralized composition of web services is performed by distributing the coordination responsibility of the component web services, originally performed at run time by the centralized execution language code, to multiple web service domains. The original software is divided into multiple code partitions and placed among different web service domains. These code partitions invoke one or more component web services and perform the required data transformation applicable to enable calling and returning data from the web services. The partitions may invoke more than one web service. The web service domains containing the code partitions that invoke more than one web service and perform the required data transformation become new coordinator nodes. In constrained data flow environments, to satisfy any data flow constraints, the data is sent from producer to consumer along a path restricted to the nodes eligible to access this data. The code performing the required data transformation is located on the nodes in this path and may span across multiple nodes.

See patent

A METHOD AND SYSTEM FOR SPEECH CLASSIFICATION

The present invention deals with a method and system for classifying at least one speech-user-utterance in a speech classification system to one of a plurality of pre-defined class types. The method comprises transcribing automatically the at least one speech-user-utterance to obtain at least one automatic-transcribed-text and estimating an estimated-transcribed-text corresponding to the at least one automatic-transcribed-text. The method further comprises classifying the at least one speech-user-utterance based on the estimated-transcribed-text. The estimated-transcribed-text is estimated based on at least one statistical model.

See patent

Inferring invisible traffic

A traffic matrix encompassing the entire Internet would be very valuable. Unfortunately, from any given vantage point in the network, most traffic is invisible. In this paper we describe results that hold some promise for this problem. First, we show a new characterization result: traffic matrices (TMs) typically show very low effective rank. This result refers to TMs that are purely spatial (have no temporal component), over a wide range of spatial granularities. Next, we define an inference problem whose solution allows one to infer invisible TM elements. This problem relies crucially on an atomicity property we define. Finally, we show example solutions of this inference problem via two different methods: regularized regression and matrix completion. The example consists of an AS inferring the amount of invisible traffic passing between other pairs of ASes. Using this example we illustrate the accuracy of the methods as a function of spatial granularity.
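
The notion of low effective rank can be illustrated with a small NumPy sketch. It assumes one common energy-based definition (the smallest number of singular values that capture a given fraction of the total squared singular values); the paper's own definition may differ, and the function name effective_rank, the 0.99 threshold, and the toy matrices are purely illustrative.

```python
import numpy as np


def effective_rank(matrix, energy=0.99):
    """Smallest r such that the top r singular values hold `energy` of the total."""
    s = np.linalg.svd(matrix, compute_uv=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(cumulative, energy) + 1)


# A toy 6 x 4 "traffic matrix" built from two source/destination patterns,
# so at most two singular values are non-negligible.
a = np.arange(1.0, 7.0)
b = np.arange(1.0, 5.0)
low_rank_tm = np.outer(a, b) + np.outer(a[::-1], b ** 2)

print(effective_rank(low_rank_tm))   # small compared with min(6, 4)
print(effective_rank(np.eye(5)))     # no low-rank structure to exploit
```

When the effective rank is small, a few visible rows or columns constrain the rest of the matrix, which is what makes inference of unobserved elements by regression or matrix completion plausible.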

Show Publication

Handling Faults in Decentralized Orchestration of Composite Web Services

Composite web services can be orchestrated in a decentralized manner by breaking down the original service specification into a set of partitions and executing them on a distributed infrastructure. The infrastructure consists of multiple service engines communicating with each other over asynchronous messaging. Decentralized orchestration yields performance benefits by exploiting concurrency and reducing the data on the network. Further, decentralized orchestration may be necessary to orchestrate certain composite web services due to privacy and data flow constraints. However, decentralized orchestration also results in additional complexity due to the absence of a centralized global state, and overlapping or different life cycles of the various partitions. This makes handling faults that arise from composite service partitions, or from the failure of component web services, a challenging task.
In this paper we propose a mechanism for handling faults in decentralized orchestration of composite web services. The mechanism includes a strategy for placement of fault handlers and compensation handlers, and schemes for fault propagation and fault recovery. The mechanism is designed to maintain the semantics of the original specification while ensuring minimal overheads.

Show Publication

Reusable Dialog Component for Content Selection from Large data Sets

Inherent limitations of spoken language interfaces make the task of information access from large data sets difficult. Providing a dialog component which can be easily configured to access information from such data sets is immensely useful. Such a component would ease and expedite the development of speech applications. We propose a dialog component which makes use of user preferences, user profile and utterance history to select relevant information from large data sets. Content presentation is also determined by user preferences and utterance history. The evaluation shows the effectiveness of the technique and the effect of the user profile in accessing information. It also demonstrates the reusability of the component to access different data sets.

Show Publication

An architecture for pluggable disambiguation mechanism for RDC based voice applications.

Building speech-based conversational systems involves the development of several speech specific control mechanisms such as validation, confirmation, disambiguation in addition to the actual application call-flow. We present an architecture for pluggable disambiguation mechanism for speech based conversational systems. The architecture provides a mechanism to decouple the disambiguation from the voice application. Several disambiguation strategies can be designed to disambiguate a user input. These strategies can be applied to the user input in a seamless manner. The disambiguated value from one component can be passed on to another component for further disambiguation. We implement the architecture by using it in the Reusable Dialog Component framework. Several illustrative examples are presented to highlight the effectiveness of having a pluggable disambiguation mechanism for voice applications.

Show Publications

Information Retrieval and Knowledge Discovery utilizing a BioMedical Patent Semantic Web

Before undertaking new biomedical research, identifying concepts that have already been patented is essential. A traditional keyword-based search on patent databases may not be sufficient to retrieve all the relevant information, especially for the biomedical domain. This paper presents BioPatentMiner, a system that facilitates information retrieval and knowledge discovery from biomedical patents. The system first identifies biological terms and relations from the patents and then integrates the information from the patents with knowledge from biomedical ontologies to create a semantic Web. Besides keyword search and queries linking the properties specified by one or more RDF triples, the system can discover semantic associations between the Web resources. The system also determines the importance of the resources to rank the results of a search and prevent information overload while determining the semantic associations.

Show Publication

SAMVAAD: Speech Applications Made Viable for Access-Anywhere Devices

The proliferation of pervasive devices has stimulated the development of applications that support ubiquitous access via multiple modalities. Since the processing capabilities of pervasive devices differ vastly, device-specific application adaptation becomes essential. We address the problem of speech application adaptation by dialog call-flow reorganisation for pervasive devices with different memory constraints. Given an atomic dialog call-flow A and device memory size m, we present optimal deterministic algorithms, RESEQUENCE and BALANCE-TREE, which minimise the number of questions in the reorganised output call-flow Am. Algorithms MASQ and MATREE produce Cm, minimally distant from input call-flow Am while accommodating the memory constraint m. These two minimisation criteria are capable of capturing various usability requirements important in dialog call-flow design. The following observation forms the cornerstone of all the algorithms in this paper: Two grammars g1 and g2 comprising |g1| and |g2| elements respectively can be merged into a single grammar g = g1 × g2 having |g1|·|g2| elements for the sequential case, and g = g1 + g2 having |g1|+|g2| elements for the tree case. Device-specific considerations lead us to introduce the concept of an -characterisation of a call-flow, defined as the set of pairs {(mi, qi) | i ∈ N}, where qi is the minimum number of questions required for memory size mi. Each call-flow has a unique, device-independent signature in its -characterisation – a measure of its adaptability. We present SAMVAAD, a system that implements these algorithms on call-flows authored in VXML containing SRGS grammars. The system was tested on an IBM voice browser using a sample airline reservation system call-flow reorganised for memories ranging from 64 MB to 210 KB. We ran an experiment with 14 users to obtain feedback on the usability of the adapted call-flows.
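
The grammar-merging observation in this abstract can be worked through in a few lines of Python. The two merge functions follow the stated sizes directly; greedy_resequence is a simple left-to-right heuristic written for illustration and is not the paper's RESEQUENCE algorithm. Using the number of grammar elements as a stand-in for memory, and the sample grammars themselves, are assumptions.

```python
from itertools import product


def merge_sequential(g1, g2):
    """Sequential case: one question accepts every pairing, |g1| * |g2| elements."""
    return [f"{a} {b}" for a, b in product(g1, g2)]


def merge_tree(g1, g2):
    """Tree case: the alternatives are pooled, |g1| + |g2| elements."""
    return list(g1) + list(g2)


def greedy_resequence(grammars, max_elements):
    """Merge adjacent grammars while the merged size fits the element budget."""
    merged = []
    for grammar in grammars:
        if merged and len(merged[-1]) * len(grammar) <= max_elements:
            merged[-1] = merge_sequential(merged[-1], grammar)
        else:
            merged.append(list(grammar))
    return merged


cities = ["delhi", "mumbai", "pune"]
days = ["today", "tomorrow"]
classes = ["economy", "business"]

# With a budget of 6 elements the first two questions collapse into one.
flow = greedy_resequence([cities, days, classes], max_elements=6)
print([len(g) for g in flow])
```

A larger budget lets more questions be merged, so the number of questions falls as memory grows; that trade-off between memory size and question count is what the characterisation described above records.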

Show Publication

Reusable Dialog Component Framework for Rapid Voice Application Development

Voice application development requires specialized speech-related skills besides general programming ability. Encapsulating the speech-specific behavior and complexities in prepackaged, configurable User Interface (UI) components will ease and expedite voice application development. These components can be used across applications and are called Reusable Dialog Components (RDCs). In this paper we propose a programming model and a framework for developing reusable dialog components. Our framework facilitates the development of voice applications via the encapsulation of interaction mechanisms, the encapsulation of best-of-breed practices (i.e., grammars, prompts, and configuration parameters), a modular design, and pluggable dialog management strategies. The framework extends the standard J2EE/JSP based programming model to make it suitable for voice applications.

Show Publication

METHOD for streaming SVD computation

Show Publication

Text-Based Summarization and Visualization of Gene Clusters

We present a system named MedSummarizer which uses biomedical literature information to assign biological meaning to a cluster of genes. Using relevant PubMed citations, it creates a ranked list of important biological concepts that describes the gene list. Further, based on the assigned concepts, it computes similarity between each pair of genes and displays this using a graph-based visualization technique. The system allows use of a human-curated index (e.g. MeSH terms) as well as automatic annotations derived from free text. We compare the results obtained using these two types of terms.

Show Publication

Information extraction from biomedical literature: methodology, evaluation and an application

Journals and conference proceedings represent the dominant mechanisms of reporting new biomedical results. The unstructured nature of such publications makes it difficult to utilize data mining or automated knowledge discovery techniques. Annotation (or markup) of these unstructured documents represents the first step in making these documents machine analyzable. In this paper we first present a system called BioAnnotator for identifying and annotating biological terms in documents. BioAnnotator uses domain-based dictionary look-up for recognizing known terms and a rule engine for discovering new terms. The combination of dictionary look-up and rules results in good performance (87% precision and 94% recall on the GENIA 1.1 corpus for extracting general biological terms based on an approximate matching criterion). To demonstrate the subsequent mining and knowledge discovery activities that are made feasible by BioAnnotator, we also present a system called MedSummarizer that uses the extracted terms to identify the common concepts in a given group of genes.

Show Publication

A System for Knowledge Management in Bioinformatics

The emerging biochip technology has made it possible to simultaneously study expression (activity level) of thousands of genes or proteins in a single experiment in the laboratory. However, in order to extract relevant biological knowledge from the biochip experimental data, it is critical not only to analyze the experimental data, but also to cross-reference and correlate these large volumes of data with information available in external biological databases accessible online. We address this problem in a comprehensive system for knowledge management in bioinformatics called e2e. To the biologist or biological applications, e2e exposes a common semantic view of inter-relationship among biological concepts in the form of an XML representation called eXpressML, while internally, it can use any data integration solution to retrieve data and return results corresponding to the semantic view. We have implemented an e2e prototype that enables a biologist to analyze her gene expression data in GEML or from a public site like Stanford, and discover knowledge through operations like querying on relevant annotated data represented in eXpressML using pathways data from KEGG, publication data from Medline and protein data from SWISS-PROT.

Show Publication

MedMesh Summarizer: Text Mining for Gene Clusters

Show Publications

A Common Data Representation for Organizing and Managing Annotations of Biochip Expression Data

This paper describes a systematic method of representation for data related to “biochip” experiments. The data is either a) gene or protein expression data as obtained from biochip experiments or b) information related to genes and proteins of interest in the biochip experiment, which has been extracted from heterogeneous data sources. The goal is to synthesize information from disparate types of biological data. The objective of this paper is to define a common representation for the disparate types of biological data in a manner that allows modelling, querying, and annotating of the biochip experimental data.

See patent

Bioinformatics for Microarrays

Microarrays (or biochips) are perhaps one of the most exciting developments in bioinformatics research. The emerging biochip technology has made it possible to simultaneously study expression (activity level) of thousands of genes or proteins in a single experiment in the laboratory. However, in order to extract relevant biological knowledge from the biochip experimental data, it is critical not only to analyze the experimental data, but also to cross-reference and correlate these large volumes of data with information available in external biological databases accessible online. We describe a comprehensive system for knowledge management in bioinformatics called e2e in which data generated by the biochip experiments can be analyzed for emerging patterns among groups of genes with additional insights from related analyses like pathway scores, sequence similarity, literature text summarization, etc. To the biologist or biological applications, e2e exposes a common semantic view of inter-relationship among biological concepts in the form of an XML representation called eXpressML. Internally, e2e can use any data integration solution (like DiscoveryLink, Kleisli or natively XML-based) to retrieve data and return results corresponding to the semantic view. We have implemented an e2e prototype that demonstrates our framework by allowing a biologist to analyze her gene expression data in GEML or from a public site like Stanford, and discover knowledge through operations like querying on relevant annotated data represented in eXpressML using pathways data from KEGG, publication data from Medline and protein data from SWISS-PROT.

Show Publications