A Near Optimal Policy for Channel Allocation in Cognitive Radio

  • Conference paper
Recent Advances in Reinforcement Learning (EWRL 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5323)

Abstract

Several tasks of interest in digital communications can be cast into the framework of planning in Partially Observable Markov Decision Processes (POMDPs). In this contribution, we consider a previously proposed model for a channel allocation task and develop an approach to compute a near optimal policy. The proposed method is based on approximate (point-based) value iteration in a continuous-state Markov Decision Process (MDP), which uses a specific internal state as well as an original discretization scheme for the internal points. The obtained results provide interesting insights into the behavior of the optimal policy in the channel allocation model.
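To make the approach concrete, the sketch below shows one backup of generic point-based value iteration (in the spirit of PBVI by Pineau, Gordon and Thrun), which is the family of methods the abstract alludes to. It is a minimal illustration over an assumed finite POMDP given by transition, observation and reward arrays and a fixed set of belief points; it does not reproduce the paper's internal-state construction or its original discretization scheme for the channel allocation model, and all names are illustrative.

```python
import numpy as np

def pbvi_backup(alphas, beliefs, T, O, R, gamma):
    """One synchronous point-based backup over a fixed belief set.

    alphas  : (K, S) array, current alpha-vectors
    beliefs : (B, S) array, belief points (rows sum to 1)
    T       : (A, S, S) array, T[a, s, s2] = P(s2 | s, a)   (assumed model)
    O       : (A, S, Z) array, O[a, s2, z] = P(z | s2, a)   (assumed model)
    R       : (A, S) array, immediate reward for action a in state s
    gamma   : discount factor in (0, 1)
    """
    A = T.shape[0]
    Z = O.shape[2]
    # Projected vectors: proj[a, z, k, s] = gamma * sum_{s2} T[a,s,s2] O[a,s2,z] alphas[k,s2]
    proj = gamma * np.einsum('ast,atz,kt->azks', T, O, alphas)
    new_alphas = []
    for b in beliefs:
        best_val, best_vec = -np.inf, None
        for a in range(A):
            # For each observation z, keep the projected vector maximizing <g, b>
            best_k = np.argmax(proj[a] @ b, axis=1)            # (Z,)
            vec = R[a] + proj[a, np.arange(Z), best_k].sum(0)  # (S,)
            val = vec @ b
            if val > best_val:
                best_val, best_vec = val, vec
        new_alphas.append(best_vec)
    # Several belief points may yield the same vector; deduplicate
    return np.unique(np.array(new_alphas), axis=0)
```

Iterating this backup yields a set of alpha-vectors that define a piecewise-linear approximation of the value function: the value at any belief b is approximated by the maximum of alpha . b over the set, and the action attaining the inner maximum at b is the greedy (near optimal) action there.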



Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Filippi, S., Cappé, O., Clérot, F., Moulines, E. (2008). A Near Optimal Policy for Channel Allocation in Cognitive Radio. In: Girgin, S., Loth, M., Munos, R., Preux, P., Ryabko, D. (eds) Recent Advances in Reinforcement Learning. EWRL 2008. Lecture Notes in Computer Science (LNAI), vol 5323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89722-4_6

  • DOI: https://doi.org/10.1007/978-3-540-89722-4_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89721-7

  • Online ISBN: 978-3-540-89722-4

  • eBook Packages: Computer Science, Computer Science (R0)
