期刊界 All Journals

162 results found; search time 93 ms


151.

Imaginary filtered hindsight experience replay for UAV tracking dynamic targets in large-scale unknown environments

《中国航空学报》2023,36(5):377-391

Unmanned Aerial Vehicles (UAVs) have been widely used as advanced weapons in military operations. In this paper, we formulate the Autonomous Navigation Control (ANC) problem of UAVs as a Markov Decision Process (MDP) and propose a novel Deep Reinforcement Learning (DRL) method that allows UAVs to perform dynamic target tracking tasks in large-scale unknown environments. To address the problem of limited training experience, the proposed Imaginary Filtered Hindsight Experience Replay (IFHER) generates successful episodes by reasonably imagining the target trajectory in a failed episode, thereby augmenting the experiences. The goal, episode, and quality filtering strategies ensure that only high-quality augmented experiences are stored, while the sampling filtering strategy of IFHER ensures that these stored experiences are fully learned according to their high priorities. Trained in a complex environment constructed from the parameters of a real UAV, the proposed IFHER algorithm improves the convergence speed by 28.99% and the convergence result by 11.57% compared to the state-of-the-art Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. Testing experiments carried out in environments of different complexities demonstrate the strong robustness and generalization ability of the IFHER agent. Moreover, the flight trajectory of the IFHER agent shows the superiority of the learned policy and the practical application value of the algorithm.
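
The hindsight idea underlying IFHER can be illustrated with the standard "final" relabeling strategy of Hindsight Experience Replay. The sketch below is not the paper's algorithm: it does not imagine a target trajectory and applies none of the goal, episode, quality, or sampling filters. The 2-D state, the `Transition` type, the sparse reward, and the tolerance `tol` are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple

Vec2 = Tuple[float, float]


class Transition(NamedTuple):
    """One step of an episode (hypothetical 2-D layout, not from the paper)."""
    state: Vec2
    action: Vec2
    next_state: Vec2
    goal: Vec2      # position the agent was asked to reach
    reward: float


def sparse_reward(achieved: Vec2, goal: Vec2, tol: float = 0.5) -> float:
    """Assumed sparse reward: 0 when within `tol` of the goal, otherwise -1."""
    dist = ((achieved[0] - goal[0]) ** 2 + (achieved[1] - goal[1]) ** 2) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Plain HER 'final' strategy: pretend the last reached state was the goal.

    Returns new transitions; the original episode is left untouched.
    """
    new_goal = episode[-1].next_state
    return [
        t._replace(goal=new_goal, reward=sparse_reward(t.next_state, new_goal))
        for t in episode
    ]
```

A failed episode relabeled this way ends with a zero-penalty step, so the replay buffer gains at least one successful example per episode. IFHER, as described in the abstract, goes further by imagining a plausible target trajectory and filtering the augmented experiences.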

152.

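A method for autonomous obstacle avoidance and target tracking of UAVs

江未来, 徐国强, 王耀南 《宇航学报》2022,43(6):802-810

To address autonomous obstacle avoidance and target tracking for UAVs, a multi-experience-pool deep Q-network (MP-DQN) algorithm is proposed on the basis of the deep Q-network (DQN) algorithm, improving both the success rate of obstacle avoidance and tracking and the convergence of the algorithm. Furthermore, the UAV is given environment-perception capability, and a directional reward-and-penalty term is designed into the reward function, which improves the UAV's generalization across environments and the overall performance of the algorithm. Simulation results show that, compared with the DQN and double DQN (DDQN) algorithms, MP-DQN converges faster, produces shorter tracking paths, and adapts better to the environment.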
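
The two baselines named in this abstract differ only in how the bootstrap target is computed. The sketch below shows the standard DQN and double DQN targets for a single transition; it does not reproduce MP-DQN's multiple experience pools, whose details are not given in the abstract. Function and parameter names are illustrative.

```python
from typing import Sequence


def dqn_target(reward: float, gamma: float,
               next_q_target: Sequence[float], done: bool) -> float:
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float, gamma: float,
                      next_q_online: Sequence[float],
                      next_q_target: Sequence[float], done: bool) -> float:
    """Double DQN: the online network selects the next action,
    the target network evaluates it, which reduces overestimation."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]
```

When the online and target networks disagree about the best next action, the double DQN target is never larger than the DQN target, which is the source of its reduced overestimation bias.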

153.

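A trajectory prediction method considering the uncertain behavior patterns of moving targets

颜鹏, 郭继峰, 白成超 《宇航学报》2022,43(8):1040-1051

Existing methods struggle to predict trajectories that are consistent with the uncertain behavior patterns of flying moving targets. To address this, a trajectory prediction method for flying moving targets based on inverse reinforcement learning is proposed, which predicts a target's trajectory by learning its behavioral preferences and simulating its decision-making process. First, a behavior decision model and a behavior preference model of the target are built with deep neural networks; the model parameters are then learned alternately using maximum-entropy inverse reinforcement learning. To learn the target's uncertain behavior characteristics effectively, supervised learning is used to fit a probability distribution model of the target's demonstration trajectories, which guides the training of the behavior preference model and initializes the behavior decision model; the behavior preference model is also pre-trained to improve its training quality. Simulation results show that the proposed method can simulate the target's behavior fairly accurately through the learned behavior decision model, and that the similarity between the predicted and true target trajectory distributions, measured by Kullback-Leibler (KL) divergence, reaches 0.24.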
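
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of the KL divergence between two discrete distributions. How the paper discretizes trajectory distributions is not stated in the abstract, so the list-of-probabilities representation and the `eps` floor are assumptions.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float],
                  eps: float = 1e-12) -> float:
    """D_KL(p || q) in nats for discrete distributions over the same bins.

    Bins where p is zero contribute nothing; q is floored at `eps`
    (an assumption of this sketch) to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0.0)
```

A value of 0 means the two distributions are identical, and larger values mean a larger mismatch, so a smaller KL divergence corresponds to a closer match between predicted and true distributions.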

154.

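Joint optimization strategy for multi-UAV cooperative computing and resource allocation in edge networks

郭永安, 王宇翱, 周沂, 房晶晶, 任保全 《南京航空航天大学学报》2023,55(5):757-767

To address the unbalanced computing load among multiple UAVs in edge network environments, which causes offloaded tasks to fail, an intelligent task offloading scheme based on cooperation among multiple UAVs is proposed. By jointly considering multi-UAV task assignment, computing resource allocation, and UAV flight trajectories, a fairness index is introduced to formulate a problem that maximizes fair UAV load and minimizes energy consumption. Based on a multi-agent deep reinforcement learning framework, a distributed algorithm that integrates trajectory planning and task offloading is proposed. Simulation results show that the proposed multi-UAV cooperation scheme significantly improves the task completion rate and load fairness, and remains effective in scenarios with large numbers of user devices.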
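
The abstract does not say which fairness index is used. A widely used choice for load balancing is Jain's fairness index, sketched below under that assumption; `loads` is a hypothetical list of per-UAV computing loads.

```python
from typing import Sequence


def jain_fairness(loads: Sequence[float]) -> float:
    """Jain's index: (sum x)^2 / (n * sum x^2).

    Equals 1.0 when all loads are equal and 1/n when a single
    UAV carries the entire load.
    """
    if not loads:
        raise ValueError("loads must not be empty")
    squares = sum(x * x for x in loads)
    if squares == 0:
        return 1.0  # all loads are zero, which is trivially balanced
    total = sum(loads)
    return (total * total) / (len(loads) * squares)
```

For example, four equally loaded UAVs give an index of 1.0, while one UAV carrying everything among four gives 0.25.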

155.

Locally generalised multi-agent reinforcement learning for demand and capacity balancing with customised neural networks

《中国航空学报》2023,36(4):338-353

Reinforcement Learning (RL) techniques are being studied to solve Demand and Capacity Balancing (DCB) problems and to fully exploit their computational performance. A locally generalised Multi-Agent Reinforcement Learning (MARL) method for real-world DCB problems is proposed. The proposed method can deploy trained agents directly to unseen scenarios in a specific Air Traffic Flow Management (ATFM) region to quickly obtain a satisfactory solution. In this method, the agents of all flights in a scenario form a multi-agent decision-making system based on partial observation. The trained agent with the customised neural network can be deployed directly on the corresponding flight, allowing the agents to solve the DCB problem jointly. A cooperation coefficient is introduced in the reward function to adjust the agent’s cooperation preference in the multi-agent system, thereby controlling how flight delay time is distributed. A multi-iteration mechanism is designed for the DCB decision-making framework to deal with problems arising from non-stationarity in MARL and to ensure that all hotspots are eliminated. Experiments based on large-scale, high-complexity real-world scenarios are conducted to verify the effectiveness and efficiency of the method. From a statistical point of view, it is shown that the proposed method generalises within the scope of the flights and sectors of interest, and that its optimisation performance exceeds that of the standard computer-assisted slot allocation and state-of-the-art RL-based DCB methods. The sensitivity analysis gives a preliminary view of the effect of the cooperation coefficient on delay time allocation.
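
The abstract does not give the form of the reward function. One simple way a cooperation coefficient can enter a multi-agent reward is as a blend between an agent's own reward and the average reward of the other agents. The sketch below is purely illustrative; the blend, the function name, and the range of `coop` are assumptions, not the paper's formulation.

```python
from typing import Sequence


def cooperative_reward(own_reward: float,
                       others_rewards: Sequence[float],
                       coop: float) -> float:
    """Hypothetical blend: coop = 0 is fully selfish, coop = 1 is fully cooperative."""
    if not 0.0 <= coop <= 1.0:
        raise ValueError("coop must lie in [0, 1]")
    if others_rewards:
        team = sum(others_rewards) / len(others_rewards)
    else:
        team = 0.0  # no other agents: only the agent's own reward matters
    return (1.0 - coop) * own_reward + coop * team
```

Under this assumed form, raising `coop` makes a flight with a large delay penalty weigh the delays of other flights more heavily, which is one way the distribution of delay across flights could be shifted.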

156.

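A backflip motion generation method for quadruped robots based on deep reinforcement learning

李岸荞, 王志成, 古勇, 吴俊, 朱秋国 《导航定位与授时》2021,8(6):35-42

Generating dexterous motor skills for quadruped robots has long attracted wide attention from robotics researchers. Among these skills, aerial flipping both demonstrates the agility of quadruped locomotion and has some practical value. In recent years, deep reinforcement learning has offered a new route to dexterous quadruped motion, and the closed-loop neural network controllers it produces are highly adaptable and stable. In this paper, a deep reinforcement learning method based on imitating expert experience is applied to the Jueying Lite (绝影Lite) robot to learn a backflip in a simulation environment, and the resulting closed-loop neural network backflip controller is further shown to be more adaptable than a conventional open-loop position controller.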

157.

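Deep reinforcement learning guidance law for intercepting maneuvering targets within the atmosphere

邱潇颀, 高长生, 荆武兴 《宇航学报》2022,43(5):685-695

For the interception of high-speed maneuvering targets within the atmosphere, a deep reinforcement learning guidance law based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm is proposed. It maps engagement state information directly to the interceptor's commanded acceleration, making it an end-to-end, model-free guidance strategy. First, the engagement kinematics of the attacker and defender are formulated as a Markov decision process suitable for deep reinforcement learning. Then, a complete deep reinforcement learning guidance algorithm is built by appropriately designing the engagement scenarios, action space, state space, and network structure required for training, and by introducing reward shaping and random state initialization. Simulation results show that, compared with proportional navigation and augmented proportional navigation, the deep reinforcement learning guidance strategy achieves a smaller miss distance while relaxing the accuracy required of midcourse guidance; it is robust and generalizes well, and its computational load is small enough to run on an onboard missile computer.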
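
As context for the baseline this abstract compares against, the following is a minimal planar sketch of classical proportional navigation, where the commanded lateral acceleration is the navigation gain times the closing speed times the line-of-sight rate. The 2-D geometry, the function name, and the default gain of 3 are assumptions for illustration; the paper's engagement model is not reproduced here.

```python
import math
from typing import Tuple


def pn_acceleration(rel_pos: Tuple[float, float],
                    rel_vel: Tuple[float, float],
                    nav_gain: float = 3.0) -> float:
    """Planar proportional navigation command a = N * Vc * lambda_dot.

    rel_pos and rel_vel are target minus interceptor (position, velocity).
    """
    rx, ry = rel_pos
    vx, vy = rel_vel
    rng = math.hypot(rx, ry)
    if rng == 0.0:
        raise ValueError("interceptor and target are at the same position")
    # Time derivative of the line-of-sight angle atan2(ry, rx).
    los_rate = (rx * vy - ry * vx) / (rng * rng)
    # Closing speed is the negative of the range rate.
    closing_speed = -(rx * vx + ry * vy) / rng
    return nav_gain * closing_speed * los_rate
```

On a pure collision course the line-of-sight rate is zero and the command vanishes; any target maneuver that rotates the line of sight produces a proportional correction.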

158.

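Design of a spacecraft attitude controller based on reinforcement learning

张瑞卿, 钟睿, 徐毅 《上海航天》2023,40(1):80-85

When a spacecraft performs certain on-orbit tasks, its mass parameters change in unknown ways, and conventional control methods perform poorly in this situation. This paper proposes a reinforcement-learning-based design method for spacecraft attitude controllers. During controller training, the method requires no dynamic model of the spacecraft and does not depend on its mass parameters. When the mass parameters undergo large unknown changes, the trained controller still maintains good control performance. Simulation tests show that a controller trained with the reinforcement learning method is indeed robust. In addition, because the design of the reward function significantly affects the training of the attitude controller, different reward function designs are studied.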

159.

Sequential dynamic resource allocation in multi-beam satellite systems:A learning-based optimization method

Yixin HUANG Shufan WU Zhankui ZENG Zeyu KANG Zhongcheng MU Hai HUANG 《中国航空学报》2023,36(6):288-301

Multi-beam antenna and beam hopping technologies are an effective solution for scarce satellite frequency resources. One of the primary challenges accompanying Multi-Beam Satellites (MBS) is an efficient Dynamic Resource Allocation (DRA) strategy. This paper presents a learning-based Hybrid-Action Deep Q-Network (HADQN) algorithm to address the sequential decision-making optimization problem in DRA. By using a parameterized hybrid action space, HADQN makes it possible to schedule the beam patte...

160.

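A reactive debris avoidance action planning method for spacecraft

吴健发, 魏春岭, 张海博 《宇航学报》2023,44(2):221-232

A reactive debris avoidance action planning method for spacecraft is proposed. First, the interfered fluid dynamical system (IFDS) algorithm is used as the base action-planning algorithm; its total disturbance matrix corrects the spacecraft's orbital velocity vector to achieve avoidance by orbital maneuver. Then, a reactive action planning method based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) deep reinforcement learning algorithm is established, in which TD3 optimizes the IFDS planning parameters online to make optimal, fast "state-action" avoidance decisions against debris clouds. On this basis, prioritized experience replay and a progressive learning strategy are introduced to improve training efficiency. Finally, simulation results show that the proposed method enables the spacecraft to safely avoid multiple, sudden, dynamic, and variously shaped space debris clouds, with good real-time performance.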
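
Prioritized experience replay, mentioned in this abstract, is commonly implemented with proportional prioritization: each stored transition is sampled with probability proportional to its priority raised to a power, and importance-sampling weights correct the resulting bias. The sketch below shows only these two formulas; the default `alpha` and `beta` values are conventional choices, not values from the paper, and how the method combines replay with TD3 and IFDS is not reproduced.

```python
from typing import List, Sequence


def per_probabilities(priorities: Sequence[float],
                      alpha: float = 0.6) -> List[float]:
    """P(i) = p_i^alpha / sum_k p_k^alpha; priorities must be strictly positive."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probs: Sequence[float],
                       beta: float = 0.4) -> List[float]:
    """w_i = (N * P(i))^(-beta), normalized so the largest weight is 1."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    top = max(raw)
    return [w / top for w in raw]
```

With `alpha = 0` sampling becomes uniform, and with `beta = 1` the weights fully compensate for the non-uniform sampling.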
