A Near Optimal Policy for Channel Allocation in Cognitive Radio

  • Conference paper
Recent Advances in Reinforcement Learning (EWRL 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5323)

Abstract

Several tasks of interest in digital communications can be cast into the framework of planning in Partially Observable Markov Decision Processes (POMDPs). In this contribution, we consider a previously proposed model for a channel allocation task and develop an approach to compute a near optimal policy. The proposed method is based on approximate (point-based) value iteration in a continuous-state Markov Decision Process (MDP), which uses a specific internal state as well as an original discretization scheme for the internal points. The obtained results provide interesting insights into the behavior of the optimal policy in the channel allocation model.
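To make the approach concrete, the sketch below shows one backup of generic point-based value iteration (in the spirit of PBVI by Pineau, Gordon and Thrun), which is the family of methods the abstract alludes to. It is a minimal illustration over an assumed finite POMDP given by transition, observation and reward arrays and a fixed set of belief points; it does not reproduce the paper's internal-state construction or its original discretization scheme for the channel allocation model, and all names are illustrative.

```python
import numpy as np

def pbvi_backup(alphas, beliefs, T, O, R, gamma):
    """One synchronous point-based backup over a fixed belief set.

    alphas  : (K, S) array, current alpha-vectors
    beliefs : (B, S) array, belief points (rows sum to 1)
    T       : (A, S, S) array, T[a, s, s2] = P(s2 | s, a)   (assumed model)
    O       : (A, S, Z) array, O[a, s2, z] = P(z | s2, a)   (assumed model)
    R       : (A, S) array, immediate reward for action a in state s
    gamma   : discount factor in (0, 1)
    """
    A = T.shape[0]
    Z = O.shape[2]
    # Projected vectors: proj[a, z, k, s] = gamma * sum_{s2} T[a,s,s2] O[a,s2,z] alphas[k,s2]
    proj = gamma * np.einsum('ast,atz,kt->azks', T, O, alphas)
    new_alphas = []
    for b in beliefs:
        best_val, best_vec = -np.inf, None
        for a in range(A):
            # For each observation z, keep the projected vector maximizing <g, b>
            best_k = np.argmax(proj[a] @ b, axis=1)            # (Z,)
            vec = R[a] + proj[a, np.arange(Z), best_k].sum(0)  # (S,)
            val = vec @ b
            if val > best_val:
                best_val, best_vec = val, vec
        new_alphas.append(best_vec)
    # Several belief points may yield the same vector; deduplicate
    return np.unique(np.array(new_alphas), axis=0)
```

Iterating this backup yields a set of alpha-vectors that define a piecewise-linear approximation of the value function: the value at any belief b is approximated by the maximum of alpha . b over the set, and the action attaining the inner maximum at b is the greedy (near optimal) action there.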



Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Filippi, S., Cappé, O., Clérot, F., Moulines, E. (2008). A Near Optimal Policy for Channel Allocation in Cognitive Radio. In: Girgin, S., Loth, M., Munos, R., Preux, P., Ryabko, D. (eds) Recent Advances in Reinforcement Learning. EWRL 2008. Lecture Notes in Computer Science (LNAI), vol 5323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89722-4_6

  • DOI: https://doi.org/10.1007/978-3-540-89722-4_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89721-7

  • Online ISBN: 978-3-540-89722-4

  • eBook Packages: Computer Science, Computer Science (R0)
