In distributed learning, the goal is to perform a learning task over data distributed across multiple nodes while minimizing communication, the expensive resource. Prior work (Daumé III et al., 2012) proposes a general model that bounds the communication required for learning classifiers while allowing for $\epsilon$ training error on linearly separable data adversarially distributed across nodes.
In this work, we develop key improvements and extensions to this basic model. Our first result is a two-party multiplicative-weight-update based protocol that uses $O(d^2 \log(1/\epsilon))$ words of communication to classify distributed data in arbitrary dimension $d$, $\epsilon$-optimally. This readily extends to classification over $k$ nodes with $O(kd^2 \log(1/\epsilon))$ words of communication. Our proposed protocol is simple to implement and considerably more efficient than the baselines we compare against, as our empirical results demonstrate.
In addition, we illustrate general algorithm design paradigms for efficient learning over distributed data. We show how to solve fixed-dimensional and high-dimensional linear programs efficiently in a distributed setting where the constraints are distributed across nodes. Since many learning problems can be viewed as convex optimization problems whose constraints are generated by individual points, this models many typical distributed learning scenarios. Our techniques make use of a novel connection to multipass streaming, as well as an adaptation of the multiplicative-weight-update framework to the distributed setting. As a consequence, our methods extend to the wide range of problems solvable with these techniques.
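The multiplicative-weight-update mechanics behind the protocol can be sketched in a few lines. The following Python fragment is a minimal single-machine illustration, not the two-party protocol itself: it maintains weights over the labeled points (the constraints), fits a cheap weighted separator each round, and multiplicatively boosts the weight of violated points. The function names and the weighted-mean "oracle" are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def mwu_separator(points, labels, rounds=200, eta=0.5):
    """Single-machine sketch of multiplicative-weight updates for finding
    an approximate linear separator. The distributed protocol additionally
    coordinates which (weighted) points each party communicates; that
    part is elided here."""
    n, d = points.shape
    w = np.ones(n) / n                    # one weight per constraint/point
    h_sum = np.zeros(d)
    for _ in range(rounds):
        h = (w * labels) @ points         # cheap weighted "oracle" hypothesis
        margins = labels * (points @ h)
        # violated points (negative margin) get exponentially more weight
        w *= np.exp(-eta * np.clip(margins, -1.0, 1.0))
        w /= w.sum()
        h_sum += h
    return h_sum / rounds                 # averaged hypothesis

# toy usage on a separable 2-D set
X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
print(np.sign(X @ mwu_separator(X, y)))  # expected: [ 1.  1. -1. -1.]
```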
We consider the problem of learning classifiers for labeled data that has been distributed across several nodes. Our goal is to find a single classifier, with small approximation error, across all datasets while minimizing the communication between nodes. This setting models real-world communication bottlenecks in the processing of massive distributed datasets. We present several very general sampling-based solutions, as well as two-way protocols with a provable exponential speed-up over any one-way protocol. We focus on core problems for noiseless data distributed across two or more nodes. The techniques we introduce are reminiscent of active learning, but rather than actively probing labels, the nodes actively communicate with one another, each node simultaneously learning the important data held by the others.
In this paper, we harness the synergy between two important
learning paradigms, namely, active learning and domain adaptation. We
show how active learning in a target domain can leverage information
from a different but related source domain. Our proposed framework, Active
Learning Domain Adapted (Alda), uses source domain knowledge
to transfer information that facilitates active learning in the target domain.
We propose two variants of Alda: a batch variant (B-Alda) and an online
variant (O-Alda). Empirical comparisons with numerous baselines on real-world
datasets establish the efficacy of the proposed methods.
[C] Online Learning of Multiple Tasks and Their Relationships
by Avishek Saha, Piyush Rai, Hal Daumé III, Suresh Venkatasubramanian
at International Conference on Artificial Intelligence and Statistics (AISTATS), 2011.
We propose an Online MultiTask Learning (Omtl) framework which simultaneously
learns the task weight vectors as well as the task relatedness adaptively from the
data. Our work contrasts with prior work on online multitask learning, which
assumes task relatedness that is fixed a priori. Furthermore, whereas prior work in such
settings assumes only positively correlated tasks, our framework can capture negative
correlations as well. Our proposed framework learns the task-relationship matrix
by framing the objective function as a Bregman divergence minimization problem
over positive definite matrices. Subsequently, we exploit this adaptively learned
task-relationship matrix to select the most informative samples in an online multitask
active learning setting. Experimental results on a number of real-world datasets
and comparisons with numerous baselines establish the efficacy of our proposed approach.
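As a rough illustration of the adaptive-relatedness idea, a perceptron-style step with a co-update weighted by an interaction matrix might look like the sketch below. This is not the paper's exact derivation (which obtains the matrix update from Bregman divergence minimization over positive definite matrices); the covariance-style matrix refresh here is an assumption made for brevity.

```python
import numpy as np

def omtl_step(W, A_inv, x, y, k, lr=0.1):
    """Schematic OMTL-style update. W is (K, d): one weight vector per
    task; A_inv is the (K, K) inverse task-interaction matrix. On a
    mistake for task k, every task's weights move in proportion to its
    (learned) relatedness to task k, and the interaction matrix is then
    refreshed from the current task weights."""
    if y * (W[k] @ x) <= 0:                        # mistake on task k
        W = W + lr * y * np.outer(A_inv[:, k], x)  # relatedness-weighted co-update
        K = W.shape[0]
        A = W @ W.T + np.eye(K)                    # illustrative PD refresh
        A_inv = np.linalg.inv(A)                   # (paper: Bregman-derived update)
    return W, A_inv

# toy usage: two tasks, three features
W, A_inv = np.zeros((2, 3)), np.eye(2)
W, A_inv = omtl_step(W, A_inv, np.array([1.0, 0.0, 1.0]), 1.0, k=0)
```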
*[C] Co-regularization Based Semi-supervised Domain Adaptation
by Hal Daumé III, Abhishek Kumar, Avishek Saha
at Advances in Neural Information Processing Systems (NIPS), 2010.
This paper presents a co-regularization based approach to
semi-supervised domain adaptation. Our proposed approach (EA++) builds
on the notion of augmented space (introduced in EASYADAPT (EA) [1])
and harnesses unlabeled data in the target domain to further enable the
transfer of information from source to target. This semi-supervised
approach to domain adaptation is extremely simple to implement and can
be applied as a pre-processing step to any supervised learner. Our
theoretical analysis (in terms of Rademacher complexity) of EA and
EA++ shows that the hypothesis class of EA++ has lower complexity
than that of EA, and hence yields tighter generalization bounds.
Experimental results on sentiment analysis tasks reinforce our
theoretical findings and demonstrate the efficacy of the proposed
method compared to EA as well as a few other baseline approaches.
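Concretely, EA's feature augmentation and EA++'s unlabeled-data encoding can be written down in a few lines. The sketch below assumes the common (shared, source-specific, target-specific) block ordering; the sign trick on unlabeled points is one way to express the co-regularization constraint that the source and target predictors should agree on unlabeled target data.

```python
import numpy as np

def ea_augment(x, domain):
    """EASYADAPT (EA): copy the features into a shared block plus the
    matching domain-specific block (ordering assumed: shared/source/target)."""
    z = np.zeros_like(x)
    return np.concatenate([x, x, z]) if domain == "source" else np.concatenate([x, z, x])

def ea_pp_unlabeled(x):
    """EA++ encoding of an unlabeled target point: under a linear model
    (w_shared, w_src, w_tgt), this augmented point scores
    w_src . x - w_tgt . x, so driving its prediction toward zero
    co-regularizes the two domain predictors."""
    z = np.zeros_like(x)
    return np.concatenate([z, x, -x])
```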
In this paper, we propose an online multitask learning framework where the weight
vectors are updated in an adaptive fashion based on inter-task relatedness. Our work
is in contrast with earlier work on online multitask learning,
where the authors use a fixed task interaction matrix to derive (fixed) update rules for all the tasks.
In this work, we propose to update this interaction matrix itself in an adaptive fashion, so that the
weight vector updates are no longer fixed but are instead adaptive. Our framework can be extended
to an active learning setting, where the informativeness of an incoming instance across all the tasks can be
evaluated using this adaptive interaction matrix. Empirical results on standard datasets
show improved performance in terms of accuracy, label complexity and number of mistakes made.
*[W] Frustratingly Easy Semi-supervised Domain Adaptation
by Hal Daumé III, Abhishek Kumar, Avishek Saha
at ACL 2010 Workshop on Domain Adaptation for Natural Language Processing (DANLP), 2010.
In this work, we propose a semi-supervised extension to a well-known
supervised domain adaptation approach, EA (Daumé III, 2007).
Our proposed approach (EA++) builds on the notion of augmented
space (introduced in EA) and harnesses unlabeled data in the target
domain to improve the transfer of information from source to target.
This semi-supervised approach to domain adaptation is extremely simple to implement, and can
be applied as a pre-processing step to any supervised learner.
Experimental results on sequential labeling tasks demonstrate the efficacy
of the proposed method.
In this work, we show how active learning in some (target) domain can leverage information
from a different but related (source) domain. We present an algorithm that harnesses
the source domain data to learn the best possible initializer hypothesis for doing active
learning in the target domain, resulting in improved label complexity. We also present a
variant of this algorithm which additionally uses the domain divergence information to selectively
query the most informative points in the target domain, leading to further reductions
in label complexity. Experimental results on a variety of datasets establish the efficacy
of the proposed methods.
We study sequential dependencies that express the semantics of data with ordered domains
and help identify quality problems with such data. Given an interval g, we write {X} ->(g) {Y} to
denote that the difference between the Y-attribute values of any two consecutive records,
when sorted on X, must lie in g. For example, {time} ->(0, ∞) {sequence number} indicates
that sequence numbers are strictly increasing over time, whereas {sequence number} ->(4,5) {time}
means that the time "gaps" between consecutive sequence numbers are between 4 and 5.
Sequential dependencies express relationships between ordered attributes,
and identify missing (gaps too large), extraneous (gaps too small) and out-of-order data.
To make sequential dependencies applicable to real-world data, we relax their requirements
and allow them to hold approximately (with some exceptions) and conditionally (on various subsets
of the data). This paper proposes the notion of conditional approximate sequential dependencies
and provides an efficient framework for discovering pattern tableaux, which are compact representations
of the subsets of the data (i.e., ranges of values of the ordered attributes) that satisfy the
underlying dependency. We present analyses of our proposed algorithms, and experiments on real
data demonstrating the efficiency and utility of our framework.
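To make the definition concrete, verifying a single sequential dependency reduces to one sort and a linear scan, as in the sketch below (interval endpoints treated as exclusive to match the strictly-increasing example; the helper names are illustrative). Discovering conditional approximate dependencies, i.e. the pattern tableaux above, is the harder problem the paper addresses.

```python
def sd_violations(records, X, Y, g):
    """Check {X} ->(g) {Y}: sort on X, then every gap between consecutive
    Y values must lie inside the interval g = (lo, hi). Returns the
    offending consecutive pairs."""
    lo, hi = g
    rows = sorted(records, key=lambda r: r[X])
    return [(a, b) for a, b in zip(rows, rows[1:])
            if not (lo < b[Y] - a[Y] < hi)]

# {time} ->(0, inf) {seq}: sequence numbers must strictly increase over time
rows = [{"time": 1, "seq": 10}, {"time": 2, "seq": 11}, {"time": 3, "seq": 11}]
print(sd_violations(rows, "time", "seq", (0, float("inf"))))  # flags the 11 -> 11 pair
```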
When merging data from various sources, it is often the case that small variations in data format
and interpretation cause traditional functional dependencies (FDs) to be violated, without there
being an intrinsic violation of semantics. Examples include differing address formats, or different
reported latitude/longitude coordinates for a given address. In this paper, we define metric functional dependencies,
which strictly generalize traditional FDs by allowing small differences (controlled by a metric) in the values
of the consequent attribute of an FD. We present efficient algorithms for the verification problem:
determining whether a given metric FD holds for a given relation. We experimentally demonstrate the
validity and efficiency of our approach on various data sets that lie in multidimensional spaces.
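A brute-force verifier makes the definition precise: group tuples on the antecedent and check that each group's consequent values pairwise stay within the metric threshold. This quadratic-per-group sketch is only the naive baseline; the paper's contribution is performing verification efficiently (the names below are illustrative).

```python
from collections import defaultdict

def holds_metric_fd(rows, X, Y, dist, delta):
    """Naive verification of the metric FD X -> Y: any two tuples that
    agree on X must have Y values within distance delta."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[X]].append(r[Y])
    return all(dist(a, b) <= delta
               for ys in groups.values()
               for i, a in enumerate(ys)
               for b in ys[i + 1:])

# same address, slightly different reported latitudes -> still holds
rows = [{"addr": "10 Main St", "lat": 40.7128},
        {"addr": "10 Main St", "lat": 40.7129}]
print(holds_metric_fd(rows, "addr", "lat", lambda a, b: abs(a - b), 1e-3))  # True
```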
[J] A Neighborhood Elimination Approach for Block Matching in Motion Estimation
by Avishek Saha, Jayanta Mukherjee, Shamik Sural
in Signal Processing: Image Communication (To appear), 2011.
We propose a new algorithm for reducing the number of search locations in block-matching
based motion estimation. The algorithm uses spatial correlation to eliminate neighboring
blocks that have a low probability of being the best match to the candidate block. Existing fast
block-matching algorithms (BMAs) use a fixed pattern to find the motion vector of a macroblock;
in contrast, the proposed algorithm is independent of any such initially fixed search pattern.
The decision to eliminate a neighborhood is taken dynamically based on a preset threshold Th,
and the extent to which the neighborhood can be eliminated is configured using a shift parameter (δ).
Thus, the reduction in the number of search positions changes dynamically depending on the input
content. Experiments compare the performance of the proposed algorithm with other existing BMAs.
In addition, we propose an Adaptive Neighborhood Elimination Algorithm (ANEA) in which the Th and δ
parameters are updated adaptively.
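For context, the baseline that the elimination scheme prunes is full-search block matching: scan every displacement in a search window and keep the minimum-SAD candidate, as in the sketch below. The exact rule by which a neighborhood is eliminated (via Th and δ) follows the paper and is not reproduced here.

```python
import numpy as np

def full_search(cur, ref, cy, cx, radius):
    """Exhaustive block matching: try every motion vector in a
    (2*radius+1)^2 window around (cy, cx) and return the displacement
    with minimum sum of absolute differences (SAD). Neighborhood
    elimination skips candidates whose spatial neighbors already
    scored poorly."""
    B = cur.shape[0]
    best_sad, best_mv = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = cy + dy, cx + dx
            if 0 <= y <= ref.shape[0] - B and 0 <= x <= ref.shape[1] - B:
                sad = np.abs(cur.astype(np.int64)
                             - ref[y:y + B, x:x + B].astype(np.int64)).sum()
                if sad < best_sad:
                    best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad
```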
[J] New pixel-decimation patterns for block matching in motion estimation
by Avishek Saha, Jayanta Mukherjee, Shamik Sural
in Signal Processing: Image Communication, Vol. 23, No. 10, pp. 725-738, Nov 2008.
This paper presents a boundary-based approach to pixel decimation, with applications in
block-matching algorithms (BMAs). The approach is based on the observation that new
objects usually enter macroblocks (MBs) through their boundaries, so MBs are matched on their
boundary regions only. The boundary-based patterns can be used to speed up motion
estimation with marginal loss in image quality, and we present different decimation levels that
trade off image quality against computational power, along with the mathematical intuition
supporting the proposed patterns. Apart from the boundary-based approach, the novelty
of our contribution also lies in performing a genetic algorithm (GA) based search to find optimal
M-length patterns in an NxN block. The resulting patterns have better values of the spatial
homogeneity and directional coverage metrics than the recently proposed N-queen decimation
lattices. Subsequently, we obtain new pixel-decimation patterns by combining the proposed
boundary-based patterns with the N-queen and GA-based patterns. Experimental results demonstrate
considerably improved coding efficiency and comparable prediction quality for these new patterns
relative to existing decimation lattices.
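For reference, an N-queen decimation lattice, the comparison baseline above, samples N of the NxN pixels so that no two samples share a row, column, or diagonal. A small backtracking generator is sketched below (illustrative code, not from the paper).

```python
def n_queen_mask(n):
    """Build an n x n 0/1 decimation mask with exactly one sampled pixel
    per row and column and no two samples on a common diagonal -- the
    N-queen lattice (a 1/n sampling density)."""
    cols = []
    def place(r):
        if r == n:
            return True
        for c in range(n):
            if all(c != pc and abs(c - pc) != r - pr
                   for pr, pc in enumerate(cols)):
                cols.append(c)
                if place(r + 1):
                    return True
                cols.pop()
        return False
    mask = [[0] * n for _ in range(n)]
    if place(0):
        for r, c in enumerate(cols):
            mask[r][c] = 1
    return mask

for row in n_queen_mask(5):   # 5 sampled pixels from a 5x5 block
    print(row)
```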
[C] Toward optimal pixel decimation patterns for block matching in motion estimation
by Avishek Saha
in Proceedings of the 15th International Conference on Advanced Computing and Communications (ADCOM), IEEE, pp. 635-640, 2007.
In this paper, we present some results on the use of N-queen sub-sampling lattices for motion estimation
in H.264. For MPEG-4, N-queen lattices have been shown to give better results than other existing
pixel-decimation lattices in terms of spatial homogeneity and directional coverage. We aim to develop
a generalized algorithm that selects an M-length pattern from an NxN block such that the selected
pattern is optimal with respect to these two metrics. In the process, we observe and present a few
interesting pixel-decimation patterns that may be useful for motion estimation.
[J] SKIP Prediction for Fast Rate Distortion Optimization in H.264
by Avishek Saha, Kallol Mallick, Jayanta Mukherjee, Shamik Sural
in IEEE Transactions on Consumer Electronics, Vol. 53, No. 3, pp. 1153-1160, Aug 2007.
In H.264, the optimal coding mode for each macroblock (MB) is selected by exhaustively searching
all MB modes over multiple reference frames. This exhaustive mode search has a very high computational
complexity. This paper proposes three different approaches, along with their combination, to reduce the
complexity of the rate-distortion optimized mode decision process in H.264. These approaches are based
on transform-domain properties and spatio-temporal correlation in video sequences. Experimental results
demonstrate that the proposed methods achieve an average speedup of around 5-7x with good
prediction quality and comparable bitrate.
[C] Toward memory efficient design of video encoders for multimedia applications
by Avishek Saha, Santosh Ghosh, Shamik Sural, Jayanta Mukherjee
in Proceedings of IEEE Computer Society Annual Symposium on VLSI (ISVLSI), pp. 453-458, May 2007.
In this paper, we suggest a scheme to reduce the memory requirement of the Full Search Block Matching
(FSBM) algorithm. We then explore the memory design space of embedded multimedia applications and
suggest a speed-energy optimized cache hierarchy.
[C] Approximate SAD computation for real-time low-power video encoders
by Avishek Saha, Shamik Sural, Jayanta Mukherjee
in Proceedings of IEE/IET International Conference on Visual Information Engineering (VIE), pp. 207-212, 2006.
This paper presents a heuristic approach to reducing the computational power requirement of the SAD cost
function used in the motion estimation phase of video encoding. The proposed technique selects the
best-matched macroblock based on partial SAD sums, resulting in lower power consumption with
marginal loss in image quality. We also propose different levels at which to trade off image quality
against computational power, and discuss the mathematical intuition supporting the heuristics. Our
approach shows good performance, especially at low bitrates.
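The partial-sum idea is easy to state in code: accumulate the SAD row by row and abandon a candidate as soon as the running total can no longer beat the best match found so far. The sketch below shows that early-exit form; the paper's heuristic goes further and decides matches directly from partial sums, trading a little image quality for additional savings.

```python
import numpy as np

def sad_early_exit(cur, cand, best_so_far):
    """Row-wise partial SAD with early termination: return None as soon
    as the partial sum proves this candidate cannot beat best_so_far."""
    partial = 0
    for r in range(cur.shape[0]):
        partial += int(np.abs(cur[r].astype(np.int64)
                              - cand[r].astype(np.int64)).sum())
        if partial >= best_so_far:
            return None        # pruned: remaining rows cannot reduce the sum
    return partial
```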
[C] Memory design and exploration for low-power embedded applications: A case study on hash algorithms
by Debojyoti Bhattacharya, Avishek Saha
in Proceedings of the 15th International Conference on Advanced Computing and Communications (ADCOM), IEEE, pp. 479-484, 2007.
Constraints imposed on various resources make embedded computing a challenging design space. One
important such constraint is memory, and proper cache design can overcome this memory bottleneck.
In this paper, we propose a methodology for cache design space exploration specific to cryptographic
hash functions. The proposed methodology finds a speed-power optimized cache configuration. We also
describe the experimental procedure that led to the formulation of the proposed exploration algorithm.
Experiments are performed on two cryptographic hash functions, SHA-1 and MD5. Our approach reduces
the exploration search space and is therefore more efficient than traditional exhaustive search.
[C] A speed-area optimization of full search block matching hardware with applications in high-definition TVs (HDTV)
by Avishek Saha, Santosh Ghosh
in Proceedings of High Performance Computing (HiPC), LNCS, Vol. 4873, pp. 83-94, 2007.
HDTV-based applications require FSBM to maintain resolution significantly higher than that of traditional
broadcast formats (NTSC, SECAM, PAL). This paper proposes techniques to increase the speed and
reduce the area requirements of FSBM hardware. These techniques are based on modifications to the
Sum-of-Absolute-Differences (SAD) computation and the macroblock (MB) searching strategy. We also
outline the design of an FSBM architecture based on the proposed approaches. The highlight of the
proposed architecture is its split pipelined design, which facilitates parallel processing of
macroblocks in the initial stages. The proposed hardware has high throughput and low silicon area,
and compares favorably with other existing FPGA architectures.
[C] Speed-area optimized FPGA implementation for full search block matching
by Santosh Ghosh, Avishek Saha
in Proceedings of IEEE International Conference on Computer Design (ICCD), pp. 13-18, 2007.
This paper presents an FPGA-based hardware design for Full Search Block Matching (FSBM) based Motion
Estimation (ME) in video compression; FSBM-based ME is what sustains the significantly higher resolution
of HDTV applications. The proposed architecture uses a modification of the Sum-of-Absolute-Differences
(SAD) computation in FSBM that drastically reduces the total number of addition/subtraction operations.
This successfully balances the conflicting design requirements of high throughput and small
silicon area. Comparison results demonstrate the superior performance of our architecture. Finally,
we discuss the design of a reconfigurable block-matching hardware.
[C] Data model of echocardiogram video for content based retrieval
by Aditi Roy, V. Pallavi, Avishek Saha, Jayanta Mukhopadhyay, A.K. Majumdar, Shamik Sural
in Proceedings of the Indian Conference on Medical Informatics and Tele-Medicine (ICMIT), 2006.
In this work, we propose a new approach to modeling and managing echocardiogram video data.
We introduce a hierarchical state-based model that represents an echo video in terms of the objects
present and their dynamic behaviors. First, the echo video is segmented by 'view', which describes
the presence of objects. Object behavior is then described by states and state transitions using a
state transition diagram, which is used to partition each view segment into states. We apply a novel
technique for detecting the states using Sweep M-mode. For efficient retrieval of information, we
propose indexes based on the views and states of objects. The proposed model thus helps store
information about similar types of video data in a single database schema, and supports content-based
retrieval of video data.