Abstract
This paper addresses the on-orbit reconfiguration problem of spacecraft attitude control systems under multi-mission constraints. Drawing on reinforcement learning, it proposes an adaptive dynamic programming algorithm for on-orbit reconfiguration decision-making based on a dual optimization index. Two optimization objectives, total mission reward and total control cost (energy consumption), are defined, and an on-orbit reconfiguration model under multi-mission constraints is established. Based on the Bellman optimality principle, the optimal reconfiguration policy is then formulated as a discrete Hamilton-Jacobi-Bellman (HJB) equation. Because the HJB equation is difficult to solve exactly, a bi-objective adaptive dynamic programming method is proposed to obtain the optimal reconfiguration policy. The method constructs a mission network and an energy network and trains them with a Q-learning-based algorithm to estimate the total mission reward and the total control cost, which yields the optimal on-orbit reconfiguration decision under multi-mission constraints. Simulation results for different cases demonstrate the validity and rationality of the proposed method.
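The abstract does not state how the two value estimates are combined or how the networks are structured, so the following is only a minimal sketch of the bi-objective idea, not the authors' implementation. It assumes two lookup tables in place of the mission and energy networks, a weighted difference (reward minus `weight` times cost) for action selection, and a hypothetical two-configuration toy system; the names `train_bi_objective_q`, `toy_step`, and all parameter values are illustrative.

```python
import random


def toy_step(state, action):
    """Hypothetical 2-configuration system (an assumption, not from the paper).

    action 0 keeps the current configuration at no control cost and earns
    mission reward 1 only in configuration 1; action 1 switches
    configuration at control cost 1 and earns no reward.
    """
    if action == 0:
        return state, (1.0 if state == 1 else 0.0), 0.0
    return 1 - state, 0.0, 1.0


def train_bi_objective_q(n_states, n_actions, step, episodes=500, horizon=10,
                         alpha=0.1, gamma=0.9, weight=0.5, epsilon=0.1, seed=0):
    """Learn separate Q-tables for mission reward and control cost.

    step(state, action) must return (next_state, mission_reward, control_cost).
    The tables stand in for the mission network and the energy network.
    """
    rng = random.Random(seed)
    q_mission = [[0.0] * n_actions for _ in range(n_states)]
    q_energy = [[0.0] * n_actions for _ in range(n_states)]

    def score(s, a):
        # Assumed scalarization: estimated reward minus weighted estimated cost.
        return q_mission[s][a] - weight * q_energy[s][a]

    def greedy(s):
        return max(range(n_actions), key=lambda a: score(s, a))

    for _ in range(episodes):
        s = rng.randrange(n_states)
        for _ in range(horizon):
            # Epsilon-greedy exploration over the scalarized score.
            a = rng.randrange(n_actions) if rng.random() < epsilon else greedy(s)
            s_next, reward, cost = step(s, a)
            a_next = greedy(s_next)
            # Both tables bootstrap from the same greedy next action.
            q_mission[s][a] += alpha * (
                reward + gamma * q_mission[s_next][a_next] - q_mission[s][a])
            q_energy[s][a] += alpha * (
                cost + gamma * q_energy[s_next][a_next] - q_energy[s][a])
            s = s_next
    return q_mission, q_energy, greedy


if __name__ == "__main__":
    q_m, q_e, policy = train_bi_objective_q(2, 2, toy_step)
    print("greedy action per configuration:", [policy(s) for s in range(2)])
```

In this toy setting the learned policy pays the one-off switching cost to leave the unrewarded configuration and then stays put. Raising `weight` makes control cost count for more in that trade-off.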
Additional information
Recommended by Associate Editor Niket Kaisare under the direction of Editor Jay H. Lee. This work was supported by the Natural Science Foundation of China (Grant No. 61673206), the 13th Five-Year Equipment Pre-Research Projects of China (Grant No. 30501050403), the Fundamental Research Funds for the Central Universities of China (Grant No. NZ2016111), and the Science and Technology on Space Intelligent Control Laboratory of Beijing (No. ZDSYS-2017-01).
Yuehua Cheng was born in 1977. She is an associate professor at Nanjing University of Aeronautics and Astronautics. Her research interests include fault-tolerant control and reconfiguration and their applications to satellite attitude control systems.
Bin Jiang was born in 1966. He is a professor and dean of the College of Automation Engineering at Nanjing University of Aeronautics and Astronautics. He serves as an Associate Editor for IEEE Trans. Control Systems Technology; Int. J. of System Science; Int. J. of Control, Automation and Systems; Acta Automatica Sinica; Systems Engineering and Electronics, etc. His research interests include fault diagnosis and fault-tolerant control and their applications.
Huan Li was born in 1993. She is a graduate student at Nanjing University of Aeronautics and Astronautics. Her research interests are spacecraft attitude control and mission planning for satellite clusters.
Xiaodong Han was born in 1983. He is a senior engineer at the Institute of Telecommunication Satellite, China Academy of Space Technology. His current research interests include satellite attitude control, fault detection and diagnosis, fault-tolerant control, and optimization computation.
About this article
Cite this article
Cheng, YH., Jiang, B., Li, H. et al. On-orbit Reconfiguration Using Adaptive Dynamic Programming for Multi-mission-constrained Spacecraft Attitude Control System. Int. J. Control Autom. Syst. 17, 822–835 (2019). https://doi.org/10.1007/s12555-018-9308-5
DOI: https://doi.org/10.1007/s12555-018-9308-5